perm filename KNOWL.TEX[AM,DBL]2 blob
sn#398086 filedate 1978-11-27 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00026 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00003 00002 \NSECP{CONCEPTS}
C00008 00003 \SSEC{Motivation and Overview}
C00011 00004 \SSSEC{A Glimpse of a Typical Concept}
C00019 00005 \SSSEC{The main constraint: Fixed set of facets}
C00029 00006 \SSSEC{BEINGs Representation of Knowledge}
C00033 00007 \SSEC{Facets}
C00039 00008 \SSSEC{Generalizations/Specializations}
C00050 00009 \SSSEC{Examples/Isa's}
C00065 00010 \SSSEC{In-Domain-of/In-Range-of}
C00080 00011 \SSSEC{Views}
C00091 00012 \SSSEC{Intuitions}
C00100 00013 \SSSEC{Analogies}
C00110 00014 \SSSEC{Conjec's}
C00130 00015 \SSSEC{Definitions}
C00147 00016 \SSSEC{Algorithms}
C00165 00017 \SSSEC{Domain/Range}
C00179 00018 \SSSEC{Worth}
C00188 00019 \SSSEC{Interest}
C00203 00020 \SSSEC{Suggest}
C00212 00021 \SSSEC{Fill/Check}
C00224 00022 \SSSEC{Other Facets which were Considered}
C00231 00023 \SSEC{AM's Starting Concepts}
C00233 00024 \SSSEC{Diagram of Initial Concepts}
C00237 00025 \SSSEC{Summary of Initial Concepts}
C00264 00026 \SSSEC{Rationale behind Choice of Concepts}
C00273 ENDMK
C⊗;
\NSECP{CONCEPTS}
This chapter contains material about AM's anatomy:
the knowledge AM starts with, and how that knowledge is represented.
\SSEC{Motivation and Overview}
Each concept consists merely of a bundle of facets. The facets
represent the different aspects of each concept, the kinds of
questions one might want to ask about the concept:
\yskip
\parindent 0pt
$\bullet $ \ How valuable is this concept?
$\bullet $ \ What is its definition?
$\bullet $ \ If it's an operation, what is legally in its domain?
$\bullet $ \ What are some generalizations of this concept?
$\bullet $ \ How can you separate the interesting instances of this concept from
the dull ones?
\parindent 19pt
\vdots
\yyskip
\noindent Since each concept represents a mathematical entity, the kinds of questions
one might ask are fairly constant from concept to concept. This set
of questions might change somewhat for a new domain of concept.
One ``natural" representation for a concept in LISP is as a
set of attri\-bute/val\-ue pairs. That is, each concept is maintained as
an atom with a property list. The {\sl names} of the properties (Worth,
Definitions, Domain/Range, Generalizations, Interestingness, etc.)
correspond to the questions above, and the {\sl value} stored under
property F of atom C is simply the value of the F-facet of the C-concept.
This value can also be viewed as the answer which expert C would give, if asked
question F. Or, it can be viewed as the
contents of slot F of frame C.
\SSSEC{A Glimpse of a Typical Concept}
As an example, here is a stylized rendition of the SETS concept. This
is a concept which is meant to correspond to the notion of a set of
elements.
For example, according to
the concept shown below, ``Singleton" is one entry on the Specializations facet
of Sets --- i.e., singletons are specific kinds of sets..
\yskip
\2
\han2{Name(s): {\it Set, Class, Collection}}
\han2{Definitions: {\it }}
\han3{Recursive: {\it $\lambda $ (S) [S=$\{\}$ or Set.Definition (Remove(Any-member(S),S))] }}
\han3{ Recursive quick: {\it $\lambda $ (S) [S=$\{\}$ or Set.Definition (CDR(S))] }}
\han3{ Quick: {\it $\lambda $ (S) [Match S with $\{\ldots \}$ ] }}
\han2{Specializations: {\it Empty-set, Nonempty-set, Set-of-structures, Singleton }}
\han2{Generalizations: {\it Unordered-Structure, No-multiple-elements-Structure }}
\han2{Examples: {\it }}
\han3{Typical: {\it $\{\{\}\},\ \{A\},\ \{A,B\},\ \{3\}$ }}
\han3{Barely: {\it $\{\}, \ \{A, B, \{C, \{ \{ \{ A, C, (3,3,3,9), <4,1,A,\{B\},A>\}\}\}\}\}$}}
\han3{ Not-quite: {\it $\{A,A\}, (), \{B,A\}$}}
\han3{ Foible: {\it <4,1,A,1> }}
\han2{Conjec's: {\it All unordered-structures are sets. }}
\han2{Intu's: {\it }}
\han3{ Geometric: {\it Venn diagram.\foo{ See [Venn 89], or [Skemp 71]. } }}
\han2{Analogues: {\it Bag, List, Oset }}
\han2{Worth: {\it 600 }}
\han2{View: {\it }}
\han3{ Predicate: {\it $\lambda (P) \{x\,\epsilon \, Domain(P)\ \relv \ P(x)\}$ }}
\han3{ Structure: {\it $\lambda $(S)\ Enclose-in-braces(Sort(Remove-multiple-elements(S))) }}
\han2{Suggest: {\it If P is an interesting predicate over X,
consider $\{x\,\epsilon \,X \ \relv\ P(x)\}$.}}
\han2{In-domain-of: {\it Union, Intersection, Set-difference, Set-equality, Subset, Member }}
\han3{In-range-of: {\it Union, Intersection, Set-difference, Satisfying }}
\yskip
\noindent \1 To decipher the Definitions facet,
there are a few things you must know.
An expression of the form ``($\lambda $ (x) E)" is called a Lambda expression
after Church\foo{
Before and during Church, it's called a function. See [Church 41]. },
and may be considered an executable procedure.
It accepts one argument, binds the variable ``x"
to the value of that argument, and then
evaluates ``E" (which is probably some expression involving the variable x).
For example, ``($\lambda $ (x) (x+5))" is a function which
adds 5 to any number; if given the argument 3, this lambda expression will return
the value 8.
The second thing you must know is that facet F of concept C will occasionally
be abbreviated as C.F. In those cases where F is ``executable", the notation
C.F will refer to applying the corresponding function.
So the first entry in the Definitions facet is recursive because it contains
an embedded call on the function Set.Definition.
Notice that we are implying that the \4name\0
of that lambda expression itself is ``Set.Definition".
There are some unusual implications of this: since there are three separate but
equivalent definitions, AM may choose whichever one it wants when it recurs.
AM can choose one via a random selection scheme, or always try to recur into the
same definition as it was just in, or perhaps suit its choice to the form of
the argument at the moment.
For example, one definition might be great for arguments
of size 10 or less, but slow for bigger ones, and another definition might be
mediocre for all size arguments; then AM should
use the mediocre definition over and over again, until the
argument becomes small enough,
and from then on recur only into the fast definition.
Although AM embodies this ``smart" scheme, the little comments necessary to
see how it does so have be excised from the version shown above;
this will be detailed later in this chapter, in section 2.5.1.8.
All concepts possess executable definitions, though not necessarily effective
ones. They each have a LISP predicate, but that predicate is not guaranteed
to terminate. Notice that the definitions for Sets are all definitions of
finite sets.
\SSSEC{The main constraint: Fixed set of facets}
One important constraint on the representation is that the set of
facets be fixed for all the concepts.
An additional constraint is that this set of facets not grow, that it be
fixed once and for all.
So there is one unchanging, universal list
of two dozen types of facets. Every facet of every concept
\4must\0 have one of those standard names. All concepts which have
some examples must store them as entries on a facet called Examples;
they can't call them Instances, or Cases, or G00037's. This
constraint is known as the ``\6Beings\0 constraint"\foo{ See [Lenat 75b].
Historically, each concept module was called a ``BEING". }, and has
three important consequences:
1. OUTLINE: First, it provides a nice, distributed, universal
framework on which to display all that is known about a given
concept.
AM can instantly tell what facets are not yet
filled in for any given concept, and this will in turn suggest new
tasks to perform. In other words, this constraint helps define the
``space" which AM must explore, and makes it obvious what parts of
each concept have and have not yet been investigated.
2. STRUCTURE: The constraint specifies that there be a \4set\0 of
facets, not just one. This set was made large enough that all the
efficiency advantages of a ``structured" representation are preserved
(unlike totally uniform representations, e.g. pure production systems
with simple memories as data structures, or predicate calculus).
3. UNIFORMITY: When AM wishes a piece of information, it must ask a
concept a ``question" --- i.e., mention a particular facet by name.
The benefit of the \6Beings\0 constraint is that there is a fixed,
small repertoire of questions it may ask, hence there will be
no long searching, no misunderstandings. This is the same advantage
that always accrues when everyone uses a common language.
We shall illustrate the last two advantages by using the Sets concept
pictured above. How does AM handle a task of
this form: ``\6Check examples of Sets\1''? AM accesses the Examples
facet of the Sets concept, and obtains a bunch of items which are all
probably sets. If any isn't a set, AM would like to make it one, if
that involves nothing difficult. AM locates all the generalizations
of Sets\foo{ by ``rippling" upward from Sets, in the Genl direction}, and
comes up with the list <Sets, Unordered-Structures,
No-multiple-elements-Structures, Structures, Objects, Any-concept,
Anything>. Next, the ``Check" facet of each of these is examined, and
all its heuristics are collected. For example, the Check facet of
the No-multiple-elements-Structures concept contains the following
entry: ``Eliminate multiple occurrences of each element" (of course
this is present not as an English sentence but rather as a little
LISP function). So even though Sets has no entries for its Check
facet, several little functions will be gathered up by the rippling
process. Each potential set would be subjected to all those checks,
and might be modified or discarded as a result.
There is enough ``structure" around to keep the heuristic rules
relevant to this task isolated from very irrelevant rules, and there
is enough ``uniformity" to make finding those rules very easy.
The same rippling would be done to find predicates which tell whether
a set is interesting or dull. For example, one entry on the
Interestingness facet of the Structure concept says that a structure
is interesting if all pairs of members satisfy the same rare
predicate P(x,y) [for any such P]. So a set, all pairs of whose
members satisfy ``Equality," would be considered interesting. In fact,
every Singleton is an interesting \4Structure\0 for just that reason.
A singleton might be an interesting \4Anything\0 because it takes only
a few characters to type it out (thereby satisfying a criterion on
Anything.Interest).
To locate all the specializations of Sets, the rippling would go in
the opposite direction. For example, one of the entries on the
Specializations facet of Sets is Set-of-structures; one if \4its\0
Specialization entries is Set-of-sets. So this latter concept will be
caught in the net when rippling away from Sets in the Specializations
direction.
If AM wants lots of examples of sets, it has only to ripple in the
Specializations direction, gathering Examples of each concept it
encounters. Examples of Sets-of-sets (like this one: $\ \{\{A\},\{\{C,D\}\}\}$)
will be caught in this way, as will examples of Sets-of-numbers (like
this one: $\ \{$1,4,5$\}$), because two specializations of Sets are
Sets-of-Sets and Sets-of-Numbers\foo{ We are assuming that AM has run
for some time, and already discovered Numbers, and already defined
Sets-of-Numbers. }.
In addition to the three main reasons for keeping the set of facets the same
for all the concepts (see previous page), we claimed there were also reasons
for keeping that set
fixed once and for all.
Why not dynamically enlarge it?
To add a new facet, its value has to be filled in for lots of concepts.
How could AM develop the huge body of heuristics needed to guide such
filling-in and checking activities? Also, the number of facets is small
to begin with because people don't seem to use more than a few tens of
such ``properties" in classifying knowledge about a concept\foo{ This data
is gathered from introspection by myself and a few others, and should probably
be tested by performing some psychological experiments. }.
If the viability of AM seemed to depend on this ability, I would have worked
on it. AM got along fine without being able to enlarge its set of facets, so
no time was ever spent on that problem.
I leave it as a challenging, ambitious ``open research problem".
\SSSEC{BEINGs Representation of Knowledge}
Before discussing each facet in detail, let's interject a brief
historic digression, to explain the origins of this modular
representation scheme.
The ideas arose in an automatic programming context, while working
out a solution to the problem of constructing a computer system
capable of synthesizing a simple concept-discrimination program
(similar to [Winston 70]). The scenario envisioned was one of mutual
cooperation among a group of a hundred or so experts, each a
specialist in some minute detail of coding, concept formation,
debugging, communicating, etc. Each expert was modelled by one
module, one BEING. Each BEING had the same kinds of slots (parts,
facets), and each slot was interpreted as a \4question\0 which that
BEING could answer. The community of experts carried on a
round-table discussion of a programming task which was specified by a
human user. Eventually, by cooperating and answering each other's
questions, they hammered out the program he desired. See [Lenat
75b] for details.
The final system, called PUP6, did actually synthesize several large
LISP programs, including many variants of the concept-learning
program. This is described in detail in [Lenat 75a]. Unfortunately,
PUP6 had virtually no natural language ability and was therefore
unusable by an untrained human. Its modal output was ``\4Eh?\1''.
The search for a new problem domain where this communication
difficulty wouldn't be so severe led to consideration of elementary
mathematics.
The other main defect of PUP6 was its narrowness, the small range of
`target' programs which could be synthesized. PUP6 had been designed
with just one target in mind, and almost all it could do was to hit
that target. The second constraint on the new task domain was then
one of having a non-specific target, a very broad or diffuse goal.
This pointed to an automated researcher, rather than a
problem-solver.
These two constraints then were (i) elementary math, because of
communication ease, and (ii) self-guided exploration, because of the
danger of too specific a goal. Together, they directed the author to
an investigation which ultimately resulted in the AM project.
\SSEC{Facets}
How \4is\0 each concept
represented?
Without claiming that this is the ``best" or preferred scheme, this section
will treat in detail AM's representation of this knowledge.
We have seen that the representation of a concept can loosely be
described as a collection of facet/value pairs, where the facets are
drawn from a fixed set of about 25 total possible facets.
The facets break down into three categories:
\noindent $\bullet $ \ Facets which relate this concept C to some other one(s):
Generalizations, Specializations, Examples, Isa's, In-domain-of,
In-range-of, Views, Intu's, Ana\-lo\-gies, Conjec's
\noindent $\bullet $ \ Facets which contain information intensive to this concept C itself:
Definitions, Algorithms, Domain/Range, Worth, Interest
\noindent $\bullet $ \ Sub-facets, containing heuristics, which can be tacked onto facets from either
group above. These include: Suggest, Fillin, Check (and might be extended
to include Justification, Origin, and other fields)
\yskip
Some facets come in several flavors (e.g., there are really four
separate facets --- not just one --- which point to Examples: boundary,
typical, just-barely-failing, foibles).
This section will cover each facet in turn. Let's begin by listing
each of them. For a change of pace, we'll show a typical question
that each one might answer about concept C:\foo{ In this discussion, ``C"
represents the name of the concept whose facet is being discussed,
and may be read ``the given concept". }
\yskip
\hangindent 30pt Name: What shall we call C when communicating with the user?
\hangindent 30pt Generalizations: Which other concepts have less restrictive definitions than C?
\hangindent 30pt Specializations: Which concepts have C's definition plus some additional
constraints?
\hangindent 30pt Examples: What are some things that satisfy C's definition?
\hangindent 30pt Isa's: Which concepts' definitions does C itself satisfy?\foo{ Notice that C will
therefore be an Example of each member of Isa's(C). }
\hangindent 30pt In-domain-of: Which operations can be performed on C's?
\hangindent 30pt In-range-of: Which operations result in values which are C's?
\hangindent 30pt Views: How can we view some other kind of entity as if it were a C?
\hangindent 30pt Intu's: What is an abstract, analogic representation for C?
\hangindent 30pt Analogies: Are there similar (though formally unrelated) concepts?
\hangindent 30pt Conjec's: What are some potential theorems involving C?
\hangindent 30pt Definitions: How can we tell if x is an example of C?
\hangindent 30pt Algorithms: How can we execute the operation C on a given argument?
\hangindent 30pt Domain/Range: What kinds of arguments can operation C be executed on? What kinds of
values will it return?
\hangindent 30pt Worth: How valuable is C? (overall, aesthetic, utility, etc.)
\hangindent 30pt Interestingness: What special features make a C especially interesting?
\yyskip
\noindent In addition, each facet F of concept C can possess a few little subfacets which
contain heuristics for dealing with that facet of C's:
\yskip
\hangindent 30pt C.F.Fillin: How can entries on C.F be filled in?
These heuristics get called on when the current task is \6``Fillin
facet F of concept X"\1, where X is a C.
\hangindent 30pt C.F.Check: How can potential entries on C.F be checked and patched up?
\hangindent 30pt C.F.Suggest: If AM gets bogged down, what are some new tasks (related to C.F)
it might consider?
\SSSEC{Generalizations/Specializations}
\qq{Generalization makes possible conscious, controlled, and accurate accomodation
of one's existing schemas, not only in response to the demands for assimilation
of new situations as they are encountered, but {\sl ahead} of these demands,
seeking or creating new examples to fit the enlarged concept.}{Skemp}
We say concept A ``\4is a generalization of\1'' concept B iff
every example of B is an example of A. Equivalently, this is true iff
the definition of B can be phrased as ``$\lambda $ (x) [A.Defn(x) {\it and} P(x)]'';
that is, for x to satisfy B's definition, it must satisfy A's definition plus
some additional predicate P.
The Generalizations facet of concept C will be abbreviated as C.Genl.
C.Genl does not contain \4all\0 generalizations of
C; rather, just the ``immediate" ones. More formally, if A is a generalization
of B, and B of C, then C.Genl will \4not\foo{The
reason for these strange constraints is so that the total number of links
can be minimized. There is no harm if a few redundant ones sneak in.
In fact, frequently-used paths are granted the status of single links, as we
shall soon see.} \0 contain a pointer to A.
Instead, C.Genl will point to B, and B.Genl will point to A.
\yskip
\noindent
Here are the recursive equations which permit a search process to quickly find
all generalizations or specializations of a given concept X:
Generalizations({\it X})
$ =↓{df} \ Genl↑*(X) =↓{df} \ \{X\} \union Generalizations(X.Genl) $
Specializations({\it X})
$ =↓{df} \ Spec↑*(X) =↓{df} \ \{X\} \union Specializations(X.Spec) $
\yskip
\noindent For the reader's convenience, here are the
similar equations, presented elsewhere in the text,
for finding all examples of --- and Isa's of --- X:
\yskip
Examples({\it X}) $ =↓{df} \ Spec↑*(Exs(Spec↑*(X))) $
Isa's({\it X}) $ =↓{df} \ Genl↑*(Isa(Genl↑*(X))) $
\yyskip
\noindent The {\it format}
of the Generalizations facet is quite simple: it is a list of
concept names. The Generalizations facet for Odd-primes might be:
(Odd-numbers Primes)
We've been talking about both Specializations and Generalizations as if they were
very similar to each other. It's time to make that more explicit:
Specializations are the converse of Generalizations. The format is the same, and
(hopefully) A is an entry on B's Specializations facet iff B is an entry on
A's Generalizations facet.
The uses of these two facets are many:
\noindent $\bullet $\ AM can sometimes establish independently that
A is both a generalization and
a specialization of B; in that case, AM would like to
recognize that fact easily, so it can
conjecture that A and B specify
equivalent concepts. Such coincidences are easily detected as {\it cycles}
in the
graph of all Genl (or Spec) relations known to AM.
In these cases, AM may physically merge A and B
(and all the other concepts in the cycle)
into one concept.
\noindent $\bullet $\ Sometimes, AM
wants to assemble a list of all specializations (or
generalizations) of X, so that
it can test whether some statement which is just barely true (or false) for X
will hold for any of those specializations of X.
\noindent $\bullet $\ Sometimes,
the list of generalizations is used to assemble a list of
isa's; similarly, the list of specializations helps assemble a list of examples.
\noindent $\bullet $\ A common and crucial use of the list of
generalizations is to locate all the
heuristic rules which are relevant to a given concept. Typically, the relevant
rules are those tacked onto Isa's of that concept, and the list of Isa's is built up
from the list of generalizations of that concept.
\noindent $\bullet $\ To incorporate new knowledge. If AM learns,
conjectures, etc. that A is a
specialization of B, then all the machinery (all the theorems, algorithms, etc.)
for B become available for working with A.
\yyskip
Here is a little trick that
deserves a couple paragraphs of its own. AM stores the answers to common questions
(like ``What are \4all\0 the specializations of Operation") explicitly,
by intentionally permitting redundant links to be maintained.
If two requests arrive closely in time, to test whether A is a generalization
of B, then the result is stored by adding ``A"
as an entry on the Generalizations facet of
B, and adding ``B" as a new entry on
the Specializations facet of A.
The slight extra space is more than recompensed in cpu time saved.
Computer scientists will perceive the analogy between this redundant
storage (to save later re-computation) and the well-known hardware technique of
{it caching}.
If the result were False (A turned out not to be a generalization of B) then the
links specify that finding explicitly,
so that the next request does not generate a long search again.
Such failures are recorded on two additional facets: Genl-not and Spec-not.
Since {\sl most} concept pairs A/B are related by Spec-not and by Genl-not,
the only entries which get recorded here are the ones which were frequently
called for by AM. If space ever gets tight, all such redundant links can be
discarded with no permanent damage done.
These two ``shadow" facets (Genl-not and Spec-not) are not useful or interesting
in their own right.
If AM ever wished to know all
the concepts which are \4not\0 generalizations of C, the fastest way would
be to take the set-difference of $\{$all concepts$\}$ and Generalizations(C).
Since they are quite incomplete, Genl-not and Spec-not are used more like a cache
memory: they save time whenever they are applicable, and don't really cost
much when they aren't applicable.
\SSSEC{Examples/Isa's}
\qq{Usually, to show that a definition implies no contradiction, we
proceed {\sl by example}, we try to make an {\sl example} of a thing
satisfying the definition. We wish to define a notion A, and we say
that, by definition, an A is anything for which certain postulates
are true. If we can demonstrate directly that all these postulates are
true of a certain object B, the definition will be justified; the
object B will be an {\sl example} of an A.}{Poincar\'e}
Following Poincar\'e, we say ``\4concept A is an example of concept B\1'' iff
A satisfies B's definition.\foo{
What does this mean? B.Defn is a Lisp predicate, a Lambda expression.
If it is fed A as its argument, and it returns True, we say that
A is a B, or that A satisfies B's definition. If B.Defn returns NIL,
we say that A is not a B, or that A fails B's definition. If B.Defn
runs out of time before returning a T/NIL value, there is no definite
statement of this form we can make.
} Equivalently, we say that ``\4A isa B\1''.
It would be legal (in that situation) for ``A" to be an entry on
B.Exs (the Examples facet of concept B) and for ``B" to be an entry on
A.Isa (the Isa's facet of concept A).
The Examples facet of C does not contain \4all\0 examples of
C; rather, just the ``immediate" ones.
The examples facet of Numbers will not contain ``11" since it
is contained in the examples facet of Odd-primes.
A ``rippling" procedure is used to acquire a list of all examples of a given
concept. The basic equation is:
\noindent Examples({\it X})
$ =↓{df} \ Specializations(Exs(Specializations(X))) $
where Exs(x) is the contents of the examples facet of x.
Examples(x) represents the final list of all known items which satisfy
the definition of X. Examples(x) thus must include Exs(x).
Specializations(x) = Spec$↑*$(x); i.e.,
all members of x.Spec, all members of \4their\0 Spec facet, etc.
Note the similarity of this to the formula for Isa's(x), given on the last
page.
As with Generalizations/Specializations, the reasons behind the
incomplete pointer structure is simply to save space, and to minimize the
difficulty of updating the graph structure whenever new links are found.
Suppose a new Mersenne prime
is computed. Wouldn't it be nice simply to add a
single entry to the Exs facet of Mersenne-primes, rather than to have to update the
Exs pointers from a dozen concepts?
There is no harm if a few redundant links sneak in.
\yskip
``Isa's" is the converse of ``Examples". The format is the same, and
(if A and B are both concepts) A is an entry on B.Isa iff B is an entry on
A.Exs. In other words, A is a member of Examples(B) iff B is a member of
Isa's(A).
The \4uses\0 of the Exs/Isa's facets are similar to those for Genl/Spec
(see previous subsection), but
their formats are quite a bit more complicated
(at the
implementation level).
There are really a cluster of different facets all related to Examples:
\noindent $\bullet $\ TYPICAL: This is a list of average examples.
Care must be taken to include a wide
spectrum of allowable kinds of examples.
For ``Sets", these would include sets of varying
size, nesting, complexity, type of elements, etc.
\noindent $\bullet $\ BOUNDARY: Items which just barely pass the
definition of this concept. This might
include items which satisfy the base step of a recursive definition, or items which
were intuitively believed to be \4non\1-examples of the concept.
For ``Sets", this
might include the empty set.
\noindent $\bullet $\ BOUNDARY-NOT: Items which just
barely fail the definition. This might include an
item which had to be slightly modified during checking,
like $\{$A,B,A$\}$ becoming
$\{$A,B$\}$.
\noindent $\bullet $\ FOIBLES: Total failures.
Items which are completely against the grain of this
concept. For ``Sets", this might include the operation ``Compose".
\noindent $\bullet $\ NOT: This is the ``cache" trick
used to store the answers to frequently-asked
questions. If AM frequently wants to know whether X is an example of Y, and
the answer is \4No\1, then much time can be saved by adding X as an entry to the
Exs-not facet of Y.
\yskip
An individual item on these facets may just be a concept name, or it may be
more complicated.
In the case of an operation, it is an item of the form <$a↓1a↓2\ldots \ \→\ v$>;
i.e., actual
arguments $a↓i$ and the value $v$ returned. In the case of objects,
it is an object of
that form (e.g., Sets.Exs might contain $\{k,\,r\}$
as one entry.)
Here is a more detailed illustration. Consider the
Examples facet of Set-union. It might appear thus:
\han4{TYPICAL: $\{$A$\}\union \{$A,B$\}\ \→\ \{$A,B$\}$;}
\han7{$\{$A,B$\}\union \{$A,B$\}\ \→\ \{$A,B$\}$;}
\han7{$\{$A,<3,4,3>,$\{$A,B$\}\}\union \{$3,A$\}\
\→\ \{$A,<3,4,3>,$\{$A,B$\}$,3$\}$.}
\han4{BOUNDARY: $\{\}\union X\ \→\ X $\foo{
Actually, AM is not quite smart enough to use the variable X as shown in the
boundary examples. It would simply store a few instances of this general rule, plus
have an entry of the form <Equivalent: Identity(X) and Set-union(X,$\{\}$)>
on the
Exs facet of Conjectures.
Notice that because of the asymmetric way Set-union was defined,
only one lopsided boundary example was found. If another definition were supplied,
the converse kind of boundary examples would be found. }}
\han4{BOUNDARY-NOT: $\{$A,B$\}\union \{$A,C$\}\ \→\ \{$A,B,A,C$\}$;}
\han7{$\{$A,B,C,D$\}\union \{$E,F,G,H,I,J$\}\ \→\ \{$A,B,C,E,F,G,H,I,J$\}$}
\han4{FOIBLES: <2,A,2>}
\han4{NOT: {\it no entries}}
\yyskip
\noindent The format for Isa's are much simpler:
there are only two kinds of links, and they're each merely a
list of concept names.
Here is the Isa facet of Set-union:
\han4{ISA: (Operation\foo{ This entry is redundant. } Domain=Range-op)}
\han4{ISA-NOT: (Structure Composition Predicate)}
\yskip
\noindent At some time, some rule asked whether Set-union {\it isa}
Composition. As a result,
the negative response was recorded by adding ``Composition" to the Isa-not
facet of Set-union, and adding ``Set-union" to the Exs-not subfacet of the
Examples facet of the concept Composition
(indicating that Set-union was definitely not an
example of Composition, yet there was no reason to consider it a foible).
\SSSEC{In-Domain-of/In-Range-of}
We shall say that A is in the domain of B (written ``A In-dom-of B")
iff
\noindent $\bullet \ $ A and B are concepts
\noindent $\bullet \ $ B isa Operation
\noindent $\bullet \ $ A is equal to (or at least a
specialization of) one of the domain components of the operation B. That is,
B can be executed using any example of A as one of its arguments.
\yskip
Although it can be recomputed very easily, we may wish to record the
fact that A In-dom-of B by adding the entry ``B" to the In-dom-of
facet of A. AM may even wish to add this new entry to the
Domain/range facet of B (where A is a specialization of the $j↑{th}$
domain component of B):
\noindent
<$D↓1\ D↓2\ldots \ D↓{j-1} \ A \ D↓{j+1} \ldots \ D↓i \ \→\ R$>.
The semantic content of ``In-dom-of" is: what can be done to
any example of
a given
concept C? Given an example of concept C, what operations can be
run on that thing? E.g.,
``Primes In-dom-of Squaring" tells us that we can apply the operation
Squaring to any particular prime number we wish.
\yskip
Let us now turn from ``In-dom-of" to the related facet ``In-ran-of."
We say that concept A is in the range of B iff B is an Activity
and A is (a specialization of) the range of B.
For example, Odd-perfect-squares is In-ran-of Squaring, since
(i) Squaring is an operation, hence an Activity, and (ii)
one of its Domain/range
entries is <Numbers $\ \→\ $ Perf-squares>, and
Perf-squares is a generalization of Odd-perfect-squares\foo{ Why? Because
Generalizations(Odd-perfect-squares) is the set of concepts
$\{$Odd-numbers Perf-squares Numbers Objects Any-concept Anything$\}$,
hence contains Perf-squares.
}.
Here is what the In-ran-of facet of Odd-perfect-squares might look
like:
\han5{(Squaring Add TIMES Maximum Minimum Cubing)}
\noindent
Each of these operations will --- at least
sometimes --- produce an odd perfect square
as its result.
Semantically, the In-ran-of relation between A and B means that one
might be able to produce examples of
A by running operation B.
Aha!
This is a potential mechanism for finding examples of a concept A.
All you need do is get hold of In-ran-of(A), and run each of those
operations. Even more expeditious is to examine the Examples facets
of each of those operations, for already-run examples whose values
should be tested using A.Defn, to see if they are examples of A's.
AM relies on this in times of high motivation; it is too ``blind" a
method to use heavily all the time.
This facet is also useful for generating situations to investigate.
Suppose that the Domain/range facet of Doubling contains only one
entry: < Numbers $\ \→\ $Numbers >. Then syntactically, Odd-numbers is in the
range of Doubling. Eventually a heuristic rule may have AM spend some
time looking for an example of Doubling, where the result was an odd
number. If none is quickly found, AM conjectures that it \4never\0 will be
found. Since one definition of Odd-number(x) is ``Number(x) and
Not(Even-number(x))", the only non-odd numbers are even numbers. So
AM will increment the Domain/range facet of Doubling with the entry
<Numbers$\ \→\ $Even-numbers>, and remove the old entry. Thus Odd-numbers
will no longer be In-dom-of Doubling. AM can of course chance upon
this conjecture in a more positive way, by noticing that all known
examples of Doubling have results which are examples of Even-numbers.\foo{ This
positive approach is in fact the way AM noticed this particular relationship. }.
A more productive result is suggested by examining the cases where
Odd-perfect-squares are the result of cubing. The smallest such odd
numbers are 1, 729, and 15625. In general, these numbers are all those of
the form ${(2n+1)}↑6$.{\footnote { }{$↑6$ Wrong.
That was an exponent, not a footnote!}}
How could AM notice such an awkward relationship?
The general question to ask, when A In-ran-of B,
is ``What is the set of domain items whose values (under the operation
B) are A's?`` In case the answer is ``All" or ``None", some special
modifications can be made to the Domain/range facets and In-dom-of,
In-ran-of facets of various concepts, and a new conjecture can be
printed. In other cases, a new concept might get created,
representing precisely the set of all arguments to B which yield
values in A. If you will, this is the inverse image of A, under
operation B. In the case of B a predicate, this might be the set of
all arguments which satisfy the predicate.
In the case of B=Cubing
and A=Odd-perfect-squares, the heuristic mentioned above will have
AM create a new concept:
the inverse image of Odd-perfect-squares under the operation of Cubing.
That is, find numbers whose cubes are Odd-perfect-squares.
It is quickly noticed that such numbers are precisely the set of
Odd-perfect-squares themselves! So
The Domain/range facet of Cubing might get this new entry:
<Odd-perfect-squares $\ \→\ $Odd-perfect-squares>.
But not all squares can be reached by cubing, only a few of them can.
AM will notice this, and
the new range would then be isolated and might be renamed
by the user ``Perfect-sixth-powers". Note that all this
was brought on by examining the In-ran-of facet of
Odd-perfect-squares. ``Cubing" was just one of the seven
entries there. There are six more stories to tell in this
tiny nook of AM's activities.
How exactly does AM go about gathering the In-ran-of and In-dom-of
lists? Given a concept C, AM can scan down the global tree of operations
(the Exs and Spec links below the concept `Active').
For if C is not In-dom-of F, it certainly won't be In-dom-of any
specialization of F. Similarly, if it can't {\sl legally } be
produced by F, it
won't be produced by any specialization of F: if you can't get x
using Doubling you'll never get it by Doubling-perfect-numbers.
So AM simply ripples
around, as usual. The precise code for this algorithm is of little
interest. There are not that many operations, and it is cheap to
tell whether X is a specialization of a given concept, so even an
exhaustive search wouldn't be prohibitive. Finally, recall that such
a search is not done all the time. It will be done initially,
perhaps, but after that the In-dom-of and In-ran-of networks will
only need slight updating now and then.
\SSSEC{Views}
Often, two concepts
A and B will be inequivalent, yet there will be a ``natural"
bijection between one and (a subset of) the other.
For example, consider a finite set S of atoms, and consider the
set of all its subsets, 2$↑S$,
also called the \4power set\0 of S, $\Pscr $(S).
Now S is a member of, but not a \4subset\0 of, 2$↑S$
(e.g., if S=$\{x,y,\ldots\}$, then $x$ is not a member of 2$↑S$).
On the other hand, we can identify or view
S as a subset by the mapping $v\ →\ \{v\}$.
Then S is associated with the following
subset of 2$↑S$: $\{ \{x\}, \{y\},\ldots \}$.
Why would we want to do this? Well, it shows
that S is identified with a \4proper\0 subset of 2$↑S$, and indicates that S has
a lower cardinality (remember: all sets are finite).
As another example, most of us would agree that the set $\{x, \{y\}, z\}$ can be
associated with the following bag: $(x, \{y\}, z)$.
Each of them can be viewed as
the other. Sometimes such a viewing is not
perfectly natural, or isn't
really a bijection: how could
the bag (2, 2, 3) be viewed as a set? Is $\{2,3\}$ better or worse than
$\{2,\{2\},3\}$?
The View facet of a concept C describes how to view instances of another concept D
as if they were C's. For example, this entry on the View facet
of Sets explains how to view any given structure as if it were a Set:
\yskip
\noindent \6Structure: \it $\lambda $ (x) Enclose-in-braces(Sort(Remove-multiple-elements(x)))\1
\yskip
\noindent If given the list <z,a,c,a>, this little program would remove
multiple elements (leaving <z,a,c>), sort the structure
(making it <a,c,z>), and replace the ``<$\ldots $>"
by ``$\{\ldots \}$", leaving the
final value as $\{$a,c,z$\}$. Note that this transformation is not 1-1 (injective);
the list <a,c,z> would get transformed into this same set.
On the other hand, it may be more useful than transforming
the original list into $\{$z,$\{$a,$\{$c,$\{$a$\}\}\}\}$
which retains the ordering and
multiple element information.
Both of those transformations may be present as entries on the View
facet of Sets.
As it turns out, the View facet of Sets actually contains only the following
information:
\yskip
\noindent \6Structure: \it $\lambda $ (x) Enclose-in-braces(x) \1
\yskip
\noindent Thus the Viewing will produce entities which are not quite sets.
Eventually,
AM will get around to executing a task of the form ``Check Examples of Sets",
and at that time the error will be corrected.
One generalization of Sets is No-multiple-elements-Structures, and one of its
entries under Examples.Check says to remove all multiple elements. Similarly,
Unordered-structures is a generalization of Sets, and one of its Examples.Check
subfacet entries
says to sort the structure. If either of these alters the structure, the old structure
is added to the Boundary-not subfacet (the `Just-barely-miss' kind) of
Examples facet of Sets.
The syntax of the View facet
of a concept C
is a list of entries; each entry specifies the name of
a concept, X, and a little program P. If it is desired to view an instance of
X as if it were a C, then program P is run on that X; the result is (hopefully)
a C. The programs P are opaque to AM; they must have no side effects and be quick.
Here is an entry on the View facet of Singleton:
\yskip
\noindent \6Anything: \it $\lambda $ (x) Set-insert(x, $\phi $) \1
\yskip
\noindent In other words, to view anything as a singleton set,
just insert it into the
empty set. Note that this is also one way to view {\sl anything} as a set.
\yskip
As you've
no doubt guessed, there is a general formula explaining this:
\yskip
\han5{$Views(X) =↓{df} View\biglp Specializations(X)\bigrp $}
\yskip
\noindent Thus,
to find all the ways of viewing something as a C, AM ripples away from C
in the Spec direction, gathering all the View facets along the way.
All of their entries are valid entries for C.View as well.
\yskip
In addition to these built-in ways of using the Views facets, some special
uses are made in individual heuristic rules.
Here is a heuristic rule which employs the Viewing facets of relevant concepts in
order to find some examples of a given concept C:
\yskip
\han2{{\it IF the current task is to Fill-in Examples of C,}}
\han3{{\it and C has some entries on its View facet,}}
\han3{{\it and one of those entries <X,P> indicates a concept
X which has some known Examples,}}
\han2{{\it THEN run the associated program P on each member of Examples(X),}}
\han3{{\it and add the following task to the agenda: ``Check Examples of C", for the
following reason: ``Some very risky techniques were used to find examples of C",
and that reason's rating is computed as:
Average(Worth(X), $\|$ the examples of C found in this manner$\|$).}}
\yskip
\noindent Say the task selected from the agenda
was ``Fill-in Examples of Sets".
We saw that one entry on Sets.View was \6Structure:
$\lambda $(x) Enclose-in-braces(x)\1.
Thus it is of the form <X,P>, with X=Structure. The above heuristic rule will
trigger
if any examples of Structures are known. The rule will then use the
View facet of Sets to find some examples of Sets.
So AM will go off, gathering all the examples of structures.
Since Lists is a Specialization of
Structure, the computation of Examples(Structures) will eventually ripple
downwards and ask for Examples of Lists. If the Examples facet of Lists
contains the entry <z,a,c,a,a>, then this will be retrieved
as one of the members of Examples(Structure). The heuristic rule takes each such
member in turn, and
feeds it to Set.View's
little program P.
In this case, the program replaces the list brackets with set braces, thus converting
<z,a,c,a,a> to $\{$z,a,c,a,a$\}$.
In this manner, all the existing structures will be converted
into sets, to provide examples of sets.
After all such conversions take place, a great number
of potential examples of Sets will exist. The final action of the right side of the
above heuristic rule is to add the new task ``\6Check examples of Sets\1'' to the
agenda. When this gets selected, all the ``slightly wrong" examples will be fixed
up. For example, $\{$z,a,c,a,a$\}$
will be converted to $\{$a,c,z$\}$.
If any reliance is made on those unchecked examples, there is the danger of
incorrectly rejecting a valid conjecture. This is not too serious, since the
very first such reliance will boost the priority of the task
``\6Check examples of Sets\1'', and it would then probably be the very next task
chosen.
\SSSEC{Intuitions}
\qq{The mathematician does not work like a machine; we cannot overemphasize
the fundamental role played in his research by a special intuition
(frequently wrong), which is not common-sense, but rather a divination
of the regular behavior he expects of mathematical beings.}{Bourbaki}
This facet turned out to be a ``dud", and was later excised from all
the concepts. It will be described below anyway, for the benefit of
future researchers. Feel free to skip directly to the next
subsection.
The initial idea was to have a set of a few (3--10) large, global,
opaque LISP functions. Each of these functions would be termed an
``Intuition" and would have some suggestive name like ``jigsaw-puzzle",
``see-saw", ``archery", etc. Each function would somehow model the
particular activity implied by its name. There would be a multitude
of parameters which could be specified by the ``caller" as if they
were the arguments of the function. The function would then work to
fill in values for any unspecified parameters. That's all the
function does. The caller would also have to specify which
parameters were to be considered as the ``results" of the function.
For the see-saw, the caller might provide the weight of the
left-hand-side sitter, and the final position of the see-saw, and ask
for the weight of the right-hand sitter. The function would then
compute that weight (as any random number greater/less-than the
left-hand weight, depending on the desired tilt of the board). Or,
the caller might specify the two weights and ask for the final
position.
The See-saw function is an expert on this subject; it has
efficient code for computing any values which can be computed, and
for randomly instantiating any variables which may take on any value
(e.g., the first names of the people doing the sitting). When an individual
call is made on this function, the caller is not told how the final values
of the variables were computed, only what those values end up as.
So the Intuitions were to be experimental laboratories for AM,
wherein it could get some (simulated) real-world empirical data. If
the seesaw were the Intuition for ``>", and ``weight" corresponded to
``Numbers", then several relationships might be visualized intuitively
(like the anti-symmetry of ``>").
This is a nice idea, but in practice
the only relationships derived in this way were the ones that were
thought up while trying to encode the Intuition functions. This
shameful behavior led to the excision of the Intuitions facets
completely from the system.
As another example, suppose AM is considering composing two relations
R and S. If they have no common Intuition reference, then perhaps
they're not meaningfully composable. If they do both tie into the
same Intuition function, then perhaps that function can tell us
something about the composition.
This is a nice idea, but in practice
very few prunings were accomplished this way, and no unanticipated
combinations were fused.
\yskip
Each Intuition entry is like a ``way in" to one of the few global
scenarios. It can be characterized as follows:
\yskip
\noindent $\bullet \ $ One of the salient features of these
entries --- and of the
scenarios --- is that AM is absolutely forbidden to look inside them,
to try to analyze them. They are {\it opaque}\1. Most Intuition
functions use numbers and arithmetic, and it would be pointless to
say that AM discovered such concepts if it had access to those
algorithms all along.
\noindent $\bullet \ $ The second characteristic of an Intuition is that it be
{\it fallible}\1. As with human intuition, there is no guarantee
that what is suggested will be verified even empirically, let alone
formally. Not only does this make the programming of Intuition
functions easier, it was meant to provide a degree of ``fairness" to
them. AM wasn't cheating quite as much if the See-saw function was
only antisymmetric 90\%\ of the time.
\noindent $\bullet \ $ Nevertheless, the intuitions are very
{\it suggestive}\1. Many
conjectures can be proposed only via Intuitions. Some analogies (see the
next subsection) can also be suggested via common intuitions.
\yyskip
After they were coded and running, I decided that the intuition
functions were unfair; they contained some major discoveries
``built-in" to them. They had the power to propose
otherwise-obscure new concepts and potential relationships.
They contributed nothing other than what was originally programmed into them;
\4they were not synergetic\1.
Due to this dubious character of
the contributions by AM's few Intuition functions,
they were removed from the system. All the examples and all the
discoveries listed in this document were made without their
assistance.
We shall now drop this de-implemented idea. I think there is some
real opportunity for research here.
For the benefit of any future researchers in this area,
let me point to the excellent
discussion of analogic representations in [Sloman 71].
\SSSEC{Analogies}
\qq{The
whole idea of analogy is that `Effects', viewed as a function of situation,
is a {\sl continuous} function.}{Poincar\'e}
As with Views and Intuitions, this facet is useful for shifting
between one part of the universe and another. Views dealt with
transformations between two specific concepts; Intuitions dealt with
transformations between a bunch of concepts and a large standard
scenario which was carefully hand-crafted in advance. In contrast,
this facet deals with transforming between a list of concepts and
another list of concepts.
Analogies operate on a much grander scale than Views. Rather than
simply transforming a few isolated items, they initiate the creation
of many new concepts. Unlike Intuitions, they are not limited in
scope beforehand, nor are they opaque. They are dynamically proposed.
The concept of ``prime numbers" is \4analogous\0 to the notion of
``simple groups".
While not isomorphic, you might guess at a few relationships
involving simple groups just by my telling you this fact:
simple groups are to groups what primes are to numbers.\foo{
If a group is not simple, it can be factored.
Unfortunately, the factorization of a group into simple groups is not unique.
Another analogizing contact: For each prime p, we can
associate the cyclic group of order p, which is of course simple.
AM never came up with the concept of simple groups; this is just an illustration
for the reader. }
Let's take 3 elementary examples, involving very fundamental
concepts:
\noindent $\bullet \ $ AM was told how to {\it View} a set as if it were a bag.
\noindent $\bullet \ $ AM was told it could {\it Intuit} the relation ``$≥$''
as the predetermined ``See-saw"
function.
\noindent $\bullet \ $ AM, by itself, once {\it Analogize}d that these two
lists correspond:
<Bags, Same-length, Operations $\Fscr ↓i$ on&into Bags>
<Bags-of-T's, Equality, Those $\Fscr ↓i$ restricted to Bags-of-T's>
\yskip
\noindent The
concept of a bag, all of whose elements are ``T"'s, is the unary
representation of \4numbers\0 discovered by AM. When the above
analogy (third one)
is first proposed, there are many known Bag-operations\foo{
i.e., all entries on In-dom-of(Bag) and In-ran-of(Bag); a few of
these are: Bag-insert, Bag-union, Bag-intersection}, but there are as
yet no numeric operations\foo{ Examples of Operation whose domain/range
contains ``Number". }. This triggers one of AM's heuristic rules, which
spurs AM on to finding the analogues of specific Bag-operations. That
is, what special properties do the bag-operations have when their
domains and/or ranges are restricted from Bags to Bags-of-T's (i.e,
Numbers). In this way, in fact, AM discovers Addition (by
restricting Bag-union to the Domain/range <Bags-of-T's Bags-of-T's $\ \→\ $
Bags-of-T's>), plus many other nice arithmetic functions.
Well, if it leads to the discovery of Addition, that analogy is
certainly worth having. How would an analogy like that be proposed?
As the reader might expect by now, the mechanism
is simply some heuristic rule adding it as an entry to
the Analogies facet of a certain concept. For example:
\yskip
\han3{{\it IF the current task has just created a canonical specialization C2 of
concept C1, with respect to operations F1 and F2, [i.e., two members
of C2 satisfy F1 iff they satisfy F2],}}
\han3{{\it THEN add the following entry to the Analogies facet of C2:}}
\han4{{\it <C1,\ F1,\ Operations-on-and-into(C1)>}}
\han4{{\it <C2,\ F2,\ Those operations restricted to C2's>}}
\yskip
After generalizing ``Equality" into the operation ``Same-length", AM
seeks to find a canonical\foo{ A natural, standard form. All bags
differing in only ``unimportant" ways should be transformed into the
same canonical form.
Two bags B1 and B2 which have the same length should get transformed into
the same canonical bag.
} representation for Bags. That is, AM seeks a
canonizing function f, such that (for any two bags x,y)
$ Same-length(x,y) \↔ Equal( f(x), f(y) ).$\0
Then the range of f would delineate the set of ``canonical" Bags. AM
finds such an f and such a set of canonical bags: the operation f
involves replacing each element of a bag by ``T", and the canonical
bags are those whose elements are all T's. In this case, the above
rule triggers, with C1=Bags, C2=Bags-of-T's, F1=Same-length,
F2=Equality, and the analogy which is produced is the one shown as
the third example above.
The Analogy facets are not implemented in full generality in the
existing LISP version of AM, and for that reason I shall refrain from
delving deeper into their format. Since good research has already
been done on reasoning by analogy\foo{ An excellent discussion of
reasoning by analogy is found in [P\'olya 54]. Some early work on emulating
this was reported in [Evans 68]; a more recent thesis on this topic is
[Kling 71]. }, I did not view it as a central feature of my work. Very
little space will be devoted to it in this document.
An important type of analogy which was untapped by AM was that
between heuristics. If two situations were similar, then conceivably the
heuristics useful in one situation might be useful (or have useful
analogues) in the new situation. Perhaps this is a viable way of
enlarging the known heuristics. Such ``meta-level" activities were
kept to a minimum throughout AM, and this
proved to be a serious limitation.
Let me stress that the failure of the Intuitions facets to be
nontrivial was due to the lack of spontaneity which they possessed.
Analogies facets were useful and ``fair" since their uses were not
predetermined by the author.
\SSSEC{Conjec's}
The ``Conjec" facet of a concept C is a list of
relationships which involve C.
There are several uses for C.Conjec:
\noindent $\bullet \ $ Store awkwardly-phrased conjectures: this wasn't really useful,
since most conjectures fell out naturally as simple relationships,
expressable, e.g., as a single Genl arc, or a single Isa arc.
\noindent $\bullet \ $ Store flimsy conjectures: apparent
relationships worth remembering but not quite believed yet.
\noindent $\bullet \ $ Hold
heuristics which notice and check conjectures.
\noindent $\bullet \ $ Obviate the need for many nearly-indistinguishable concepts:
Collapse the entire essence of a concept like ``Odd-primes"
into one or two relationships
involving ``Primes"; then discard ``Odd-primes".
\noindent $\bullet \ $ Untangling paradoxes: a historic record, to facilitate
backtracking in case of catastrophe, which (luckily!) wasn't ever needed.
\noindent $\bullet \ $ Improve existing algorithms (e.g., once you know primes are odd,
hunt only through odd numbers), improve
testing procedures, representations, etc.
\noindent $\bullet \ $ Display AM's most impressive observed relationships in a form which is easily
inspectable by the user.
\yskip
The syntax of this facet is simply a list of conjectures, where each conjecture
has the form of a relationship: (R a b c$\ldots $d). R is the name of a known operation
(in which case, abc$\ldots $ are its arguments and we claim that d is its value).
For instance, ``\it (Same-size Insert(S,S) S False)\1" is a conjecture
that
inserting a set into itself
will always give you a set of a different length.
This conjecture happens to be true for finite sets.
\SSSEC{Definitions}
A typical way to disambiguate a concept from all others is to provide
a ``definition" for it.\foo{ As EPAM studies showed
[Feigenbaum 63],
one can
never be sure that this definition will specify the concept uniquely
for all time. In the distant future, some new concept may differ in
ways thought to be ignorable at the present time. } Almost every
concept had some entries initially supplied on its ``Definitions"
facet. The format of this facet is a list of entries, each one
describing a separate definition. A single entry will have the
following parts:
\yskip
\noindent $\bullet \ $ Descriptors: Recursive/Linear/Iterative, Quick/Slow,
Opaque/Transparent, Once-only/Early/Late, Destructive/Nondestructive.
\noindent $\bullet \ $ Relators: Reducing to the definition of concept X, Same as Y
except$\ldots $, Specialized version of Z, Using the definition of W, etc.
\noindent $\bullet \ $ Predicate: A small, executable piece of LISP code, to tell if any
given item is an example of this concept.
\yskip
The semantics of this are that
the predicate or ``code" part of the entry (i)
must be faithfully described by the
Descriptors, and (ii) must be related to other concepts just as the Relators
claim.
Here is a typical entry from the Definitions facet of the Set-union concept:
\yskip
\han3{{\6 Descriptors: {\it Slow, Recursive, Transparent }}}
\han3{{\6 Relators: {\it Uses the algorithm for Set-insert, Uses the definition of Empty-set,
Uses the definition of Set-equal, Uses the algorithm for Some-member,
Uses the algorithm for Set-delete, Uses the definition of Set-union }}}
\han3{{\6 Code: {\it $\lambda $ (A B C) }}}
\han5{{\bf IF \it Empty-set.\ Defn(A)}}
\han5{{\bf THEN \it Set-equal.\ Defn(B,C) }}
\han5{{\bf ELSE }}
\han6{{\it X $←$ Some-member.\ Alg(A)\foo{The
notation ``X $←$ Some-member.Alg(A)" means that any one algorithm
for the concept Some-member should be accessed, and then it should be
run on the argument A. The result, which will be an element of A, is
to be assigned the name ``X". The effect is to bind the variable X to
some member of set A.} }}
\han6{{\it A $←$ Set-delete.\ Alg(X,A) }}
\han6{{\it B $←$ Set-insert.\ Alg(X,B) }}
\han6{{\it Set-union.\ Defn(A,B,C) }}
\yskip
\noindent Let me stress that this is just one entry from one facet of one
concept.
This particular definition is not very efficient, but it is described
as Trans\-par\-ent. That means it is very well suited to analysis and
modification by AM itself. Suppose some heuristic rule wants to
generalize this definition. It can peer inside it, and, e.g., replace
the base step call on Set-equal, by a call on a generalization of
Set-equal (say ``Same-length"\foo{ For disjoint sets, the new definition
would specify the operation which we call ``addition". }).
How could \4different\0 definitions help here? Suppose there were a
definition which first checked to see if the three arguments were
Set-equal to each other, and if so then it instantly returned T as
the value of the definition predicate; otherwise, it recurred into
Set-union.Defn again. This might be a good algorithm to try at the
very beginning, but if the Equality test fails, we don't want to keep
recurring into this definition. This algorithm should thus have a
Descriptor labelling it ONCE-ONLY EARLY.
There are three purposes to having Descriptors and Relators hanging
around:
\noindent 1. For the benefit of the user. AM appears more intelligent because
it can \4describe\0 the kind of definition it is using --- and why.
\noindent 2. For the sake of efficiency. When all AM wants to do is to evaluate
Set-union(A,B), it's best just to grab a \4fast\0 definition. When trying to
generalize Set-union, it's more appropriate to
modify a very clean, transparent definition --- even if it is a slow one.
\noindent 3. For the benefit of the heuristic rules. Often, a left- or a
right-hand-side will ask about a certain kind of definition. For
example, \it ``If a transparent definition of X exists, then try to
specialize X"\1.
\yskip
Let me pull back the curtain a little further, and expose the actual
implementation of these ideas in AM. The secrets about to be
revealed will not be acknowledged anywhere else in this document.
They may, however, be of interest to future researchers. Each
concept may have a cluster of Definition facets, just as it can have
several kinds of Examples facets. These include three types:
Necessary and sufficient definitions, necessary definitions, and
sufficient definitions. These three types have the usual
mathematical meanings. All that has been alluded to before (and
after this subsection) is the necc&suff type of definition (x is an
example of C \4if and only if\0 x satisfies C.Def/necc&suff). Often,
however, there will be a much quicker sufficient definition (x
satisfies C.Def/suf, \4only if\0 x is certainly a C). Similarly,
entries on C.Def/nec are useful for quickly checking that x is
\4not\0 an example of C (to check this, it suffices to verify that x
\4fails\0 to satisfy a necessary definition of C).
So given the task of deciding whether or not x is an example of C, we
have many alternatives:
(i) If x is a concept, see if C is a member of x.ISA (if so, then x is
an example of C).
(ii) Try to locate x within C.Exs. (depending upon the flavor of subfacet
on which x is found, this may show that x is or is not
an example of C).
(iii) If x is a concept, ripple to collect ISA's(x), and see if C is a
member of ISA's(x).
(iv) If there is a fast sufficent definition of C, see if x satisfies
it.
(v) If there is a fast necessary definition of C, see if x fails it
(if so, then x is \4not\0 an example of C).
(vi) If there is a necessary and sufficient definition of C, see whether
or not x satisfies that definition (this may show that x is or is not
an example of C).
(vii) Try to locate x within C.Exs. (depending upon the flavor of subfacet
on which x is found, this may show that x is or is not
an example of C).
(viii) Recur: check to see if x is an example of any specialization of
C.
(ix) Recur: check to see if x is \4not\0 an example of some
generalization of C (if so, then x is \4not\0 an example of C).
\yskip
\noindent In fact, there is a LISP function, IS-EXAMPLE, which performs those
nine steps in that order. At each moment, there is a timer set, so even
if there is a necessary and sufficient definition hanging around, it
might run out of time before settling the issue one way or the other.
Each time the function recurs, the timer is granted a smaller and
smaller quantum, until finally it has too little to bother recurring
anymore. There is a potential overlap of activity: to see if x is an
example of C, the function might ask whether x is or is not an
example of a particular generalization of C (step (ix), above); to test
\4that\1, AM might get to step (viii), and again ask if x is an example of
C. Even though the timer would eventually terminate this fiasco (and
even though the true answer might be found despite this wasted
effort) it is not overly smart of AM to fall into this loop.
Therefore, a stack is maintained, of all concepts whose definitions the
IS-EXAMPLE function tried to test on argument x. As the function
recurs, it adds the current value of C to that stack; this value
gets removed when the recursion pops back to this level, when that recursive call
``returns" a value.
\SSSEC{Algorithms}
Earlier, we said that each concept can have any facets from the universal fixed set
of 25 facets.
This is not strictly true. Sometimes, a whole class
of concepts will possess a certain type of facet which no others may meaningfully
have.
That is, there will be a \4domain of applicability\0 for the facet, just
as we defined such domains of applicability for heuristics.
For example, consider the ``Domain/Range" facet. It is meaningful only to
``operations".
Let's view this in a more general light.
For each facet f, the only concepts which
can have entries for facet f are examples of some particular concept $Dscr $(f).
$Dscr $(f) is the domain of applicability of facet f.
If C is any concept which is not an example of $Dscr $(f), then it can never
meaningfully possess any entries for that facet f.
For almost all facets f, $Dscr $(f) is ``Any-concept". Thus any concept can possess
almost any facet.
For example, $Dscr $(Defn)=``Any-concept", so any concept may have definitions.
But $Dscr $(Domain/range)=``Operation". So only operations can have domain/range facets.
Similarly, $Dscr $(Algorithms)=``Actives". This facet is the subject of this section.
The Algorithms facet is present for all --- but only
for --- Actives
(predicates, relations, and operations).
The representation is, just as with Definitions,
a list of entries, each one
describing a separate algorithm, and each one having three parts:
Descriptors, Relators, Program.
Instead of a LISP predicate, however, the Algorithms facets possess a LISP function
(an executable piece of code whose value
will in general be other than True/False).
All the details about understanding the descriptors and relators are embedded in the
fine structure of the heuristic rules. A left-hand-side may test whether a
certain kind of algorithm exists for a given concept. A right-hand-side which
fills in a new algorithm must also worry about filling
in the appropriate descriptors
and relators. As with newly created concepts, such information is trivial to fill in
at the time of creation, but becomes much harder after the fact.
Here is a typical heuristic rule which results in a new entry being added to the
Algorithms facet of the newly-created concept named
Compose-Set-Intersect\-&\-Set-Intersect\foo { The
operation ``$ \inter \circ \inter $" takes
three sets and intersects them}:
\han2{{\it IF the task is to Fillin Algorithms for F, }}
\han3{{\it and F is an example of Composition}}
\han3{{\it and F has a definition of the form $ F =↓{df} G\circ H$,}}
\han3{{\it and F has no transparent, nonrecursive algorithm,}}
\han2{{\it THEN add a new entry to the Algorithms facet of F,}}
\han3{{\it with Descriptors: Transparent, Non-recursive}}
\han3{{\it with Relators: Reducing to G.Alg and H.Alg,
Using the Definition of <G.Domain>}}
\han3{{\it with Program:}}
\han5{{\it $\lambda (\|<G.\,Domain>\|,\,\|<H.\,Domain>\| - 1,\,X)$}}
\han6{{\it (SETQ X (H.\,Alg $\|$<G{\2.}Domain>$\|$))}}
\han6{{\it (AND }}
\han7{{\it (<G{\2.}Domain>{\2.}Defn X) }}
\han7{{\it (G{\2.}Alg X $\|$<H{\2.}Domain>$\| - 1$))}}
\yskip
\noindent The intent of the little program which gets created
is to apply the first operator, check that the
result is in the domain of the second, and then apply the second operator.
The expression $\|<G.\,Domain>\|$ means find a domain/range entry for G,
count how many domain components there are, and form a list that long
from randomly-chosen variable names {\it (u,v,w,x,y,z).}
For the case mentioned above, F = Compose-Set-Intersect&Set-Intersect,
G = Set-Intersect,
and H = Set-Intersect. The domain of
G is a pair of Sets, so
\noindent $\|<G.\,Domain>\|$ is a list
of 2 variables, say ``{\it (u v).}" Similarly,
$\|<H.\,Domain>\| - 1$\
is a list
of 1 variable, say ``{\it (w)}." Putting all this together, we see that
the new definition entry created for
Compose-Set-Intersect&Set-Intersect
would look like this:
\yskip
\han3{{\6 Descriptors: {\it Non-Recursive, Transparent }}}
\han3{{\6 Relators: {\it Reducing to Set-Intersect{\2.}Alg, Using the definition of Sets }}}
\han3{{\6 Code: {\it $\lambda $ (u,v,w,X) }}}
\han6{{\it (SETQ X (Set-Intersect{\2.}Alg u v)) }}
\han6{{\it (AND }}
\han7{{\it (Sets{\2.}Defn X) }}
\han7{{\it (Set-Intersect{\2.}Alg X w) }}
\yyskip
At times, AM will
be capable of producing only a slow algorithm for some new concept C.
For example, TIMES\inv (x) was
originally defined by AM as a blind, exhaustive search
for bags of numbers whose product is x. As AM uses that algorithm more and more,
AM records how slow it is. Eventually, a task is selected of the form
\6``Fillin new algorithms for C"\1, with the two reasons being that the existing
algorithms are all too slow, and they are used frequently.
At this point, AM should draw on a body of rules which take a declarative
definition and transform it into an efficient algorithm, or which take an
inefficient algorithm and speed it up. Doing a good job on just those rules
would be a mammoth undertaking, and the author decided to omit them.
The reader who wishes to know more about rules for creating
and improving LISP algorithms is directed to [Darlington and Burstall 73].
A more general discussion of the principles involved can be found in [Simon 72].
\SSSEC{Domain/Range}
Another facet possessed only by active concepts is Domain/Range.
The syntax of this facet is quite simple. It is a list of entries, each of the form
\noindent \6< $D↓1 D↓2 \ldots \ \→\ R$ >\1, where
there can be any number of $D↓i$'s
preceding the arrow,
and $R$ and all the $D↓i$'s are the names of concepts.
Semantically, this entry means that the active concept may be run on
a list of arguments
where the first one is an example of $D↓1$, the second an example of $D↓2$,
etc.,
and in that case will return a value
guaranteed to be an example of $R$.
In other words, the concept may be considered a relation on the
Cartesian product
$D↓1\times D↓2\times \cdots \ \times R$.
We shall say that the \4domain\0 of the concept is
$D↓1\times D↓2\times \cdots \,$, and that its \4range\0 is $R$. Each $D↓i$
is called a
\4component\0 of the domain.
For example, here is what the Domain/Range facet of TIMES might look like:
\han2{$\{$}
\han3{{\it < Numbers Numbers $\ \→\ $ Numbers > }}
\han3{{\it < Odd-numbers Odd-numbers $\ \→\ $ Odd-numbers > }}
\han3{{\it < Even-Numbers Even-Numbers $\ \→\ $ Even-numbers > }}
\han3{{\it < Odd-numbers Even-Numbers $\ \→\ $ Even-Numbers > }}
\han3{{\it < Perf-Squares Perf-Squares $\ \→\ $ Perf-Squares > }}
\han3{{\it < Bags-of-Numbers $\ \→\ $ Numbers > }}
\han2{$\}$}
\yskip
The Domain/Range part is useful for pruning away absurd compositions, and
for syntactically suggesting compositions and ``coalescings".
Let's see what this means.
Suppose some rule sometime tried to compose TIMES$\circ $Set-union.
A rule tacked onto Compose says to ensure that the range of Set-union
at least intersects
(and preferably is \4equal\0 to)
some component of the domain of TIMES. But there
are no entities which are both sets and numbers;
ergo this fails
almost instantaneously.
The claim was also made that Domain/Range facets help propose plausible coalescings.
By ``\4coalescing\1'' an operation,
we mean defining a new one, which differs from the original one in that
a couple of the arguments must now coincide. For example,
coalescing TIMES(x,y) results in the new operation F(x) defined as TIMES(x,x).
Syntactically, we can coalesce a pair of domain components of the domain/range
facet of an operation if those two domain components are equal, or if
one of them is a specialization of the other, or even if they merely intersect.
In the case of one related to the other by specialization, the more specialized
concept will replace both of them, In case of merely intersecting, an extra test
will have to be inserted into the definition of the new coalesced operation.
Given this domain/range entry for Set-insert: < Anything Sets $\ \→\ $ Sets >,
we see that
it is ripe for coalescing. Since Sets is a specialization of Anything, the new
operation F(x), which is defined as Set-insert(x,x), will have a domain/range
entry of the form < Sets $\ \→\ $ Sets >. That is, the specialized concept Sets
will replace both of the old domain elements (Anything and Sets).
F(x) takes a set x and inserts it into itself.
Thus F($\{a,b\}$) = $\{a,b,\{a,b\}\}$.
In fact, this new operation F is very exciting because it always seems to give
a new, larger set than the one you feed in as the argument.
We have seen how the Domain/range facets can
prune away meaningless coalescings, as well as meaningless
compositions. Any proposed composition or coalescing will at least be syntactically
meaningful. If all compositions are proposed only for at least one good
semantic reason, then those passing the domain/range test, and hence
those which ultimately get created, will all be valuable new concepts.
Since almost all coalescings are semantically interesting, \4any\0 of them which
have a valid Domain/Range entry will get created and probably will be interesting.
This facet is occasionally used to suggest conjectures to investigate. For example,
a heuristic rule says that if the domain/range entries have the form
<$D\, D \,D \,\ldots \ \→\ genl(D)$\foo{ ``{it genl(D)}" means:
any generalization of concept D. }>,
then it's worthwhile seeing whether the value of this
operation doesn't really always lie inside $D$ itself.
This is used right after the
Bags$\↔$Numbers analogy is found,
in the following way. One of the Bag-operations
known already is Bag-union. The analogy causes AM to consider a new operation,
with the same algorithm as Bag-union, but restricted to Bags-of-T's
(numbers in unary
representation). The Domain/range facet
of this new, restricted mutation of Bag-union
contains only this entry:
<Bags-of-T's Bags-of-T's $\ \→\ $ Bags>.
Since Bags is a generalization of Bags-of-T's, the
heuristic mentioned above triggers,
and AM sees whether or not the union of two Bags-of-T's is
always a bag containing only T's. It appears to be so, even in extreme cases, so the
old Domain/range entry is replaced by this new one:
<Bags-of-T's Bags-of-T's $\ \→\ $ Bags-of-T's>.
When the user asks AM to call these bags-of-T's ``numbers", this entry becomes
<Numbers Numbers $\ \→\ $ Numbers>.
In modern terms, then, the conjecture suggested was that the sum of two numbers
is always a number.
To sum up this last ability in fancy language,
we might say that
one mechanism for proposing
conjectures is the prejudicial belief in the unlikelihood of asymmetry.
In this case, it is asymmetry in the parts of a Domain/range entry that
draws attention.
Such conjecturing can be done by any action part of any heuristic rule;
the Conjec facet entries don't have a monopoly on initiating this type of activity.
\SSSEC{Worth}
\noindent How can we represent the worth of each concept? Here are some
possible suggestions:
\yskip
1. The most intelligent
(but most difficult)
solution is ``purely symbolically". That is,
an individualized description of the good and bad points of the
concept; when it is useful, when misleading, etc.
2. A simpler solution would be to ``standardize" the above symbolic
description once and for all, fixing a universal list of questions.
So each concept would have to answer the questions on this list (How
good are you at motivating new concepts?, How costly is your
definition to execute?,$\ldots $). The answers might each be symbolic;
e.g., arbitrary English phrases.
3. To simplify this scheme even more, we can assume that the answers
to each question will be numeric-valued functions (i.e, LISP code
which can be evaluated to yield a number between 0 and 1000). The
vector of numbers produced by {\it Eval}uating all these functions will
then be easy to manipulate (e.g. using dot-product, vector-product,
vector-addition, etc.), and the functions themselves may be inspected
for semantic content. Nevertheless, much content is lost in passing
from symbolic phrases to small LISP functions.
4. A slight simplification of the above would be to just store the
vector of numbers answering the fixed set of questions; i.e., don't
bother storing a bunch of programs which compute them dynamically.
5. Even simpler would be to try to assign a single ``worthwhileness"
number to each concept, in lieu of the vector of numbers. Simple
arithmetic operations could manipulate Worth values then. In some
cases, this linear ordering seems reasonable (``primes" really are
better than ``palindromes".) Yet in many cases we find concepts which
are too different to be so easily compared (e.g., ``numbers" and
``angles".)
6. The least intelligent solution is none at all: each concept is
considered equally worthwhile as any other concept. This threatens
to be combinatorial dynamite.
\yskip
\noindent As we progress along the intelligent
$===\→$ trivial dimension, we find
that the schemes get easier and easier to code, the Worth values get
easier and easier to deal with, but the amount of reliable knowledge
packed into them decreases.
Initially, scheme \6$\#$3\0 above was chosen for AM: a vector of numeric-valued
procedural
answers to a fixed set of questions. Here are those questions, the
components of the Worth vectors for each concept:
\yskip
\han3{ $\bullet $ Overall aesthetic worth. }
\han3{ $\bullet $ Overall utility. Combination of usefulness, ubiquity. }
\han3{ $\bullet $ Age. How many cycles since this concept was created? }
\han3{ $\bullet $ Life-span. Can this concept be forgotten yet? }
\han3{ $\bullet $ Cost. How much cpu time has been spent on this concept, since its
creation? }
\yskip
\noindent Notice that in general no constant number can answer one of these
questions once and for all (consider, e.g., Life-span). Each `answer'
had to be a numeric-valued LISP function.
A few questions which crop up often are not present on this list,
since they can be answered trivially using standard LISP functions
(e.g., ``How much space does concept C use up?" can be found by
calling the function ``COUNT" on the property-list of the LISP atom
``C").
Another kind of question, which was anticipated and did in fact come
up frequently, is of the form ``How good are the entries on facet F of
this concept?", for various values of F. Since there are a couple
dozen kinds of facets, this would mean adding a couple dozen more
questions to the list. The line must be drawn somewhere. If too much
of AM's time is drained by evaluating where it is already, it can
never progress.
The heuristic rules are responsible for initially setting up the
various entries on the Worth facets of new concepts, and for
periodically altering those entries for \4all\0 concepts, and for
delving into those entries when required.
Recent experiments\foo{ See Experiment 1, which is described in
subsection 6.2.2. } have shown that
there was little change in behavior when each vector of functions was
replaced by a single numeric function (actually, the sum of the
values of the components of the ``old" vector of functions). There
wasn't even too much change when this was replaced by a constant.
There \4was\0 a noticeable degradation (but no collapse) when
all the concepts' Worth numbers were set equal to each other initially.
For the purposes of this document, then (except for this page and the
discussion of Experiment 1), we may as well assume that each concept
has a single number (between 0 and 1000) attached as its overall
``Worth" rating. This number is set\foo{ The author initially sets this
value for the 115 initial concepts. Heuristic rules set it for each
concept created by AM. } and referenced and updated by heuristic
rules. Experiment 1 can be considered as showing that a more
sophisticated Worth scheme is not necessary for the particular kinds
of behaviors that AM exhibits.
\SSSEC{Interest}
Now that we know how how to judge the overall worth of the concept
``Composition", let's turn to the question of how interesting some
specific composition is. Unfortunately, the Worth facet really has
nothing to say about that problem. The Worth of the concept ``Compose"
has little effect on how interesting a particular composition is:
``Count$\circ$Divisors-of" is very interesting, and
``Insert$\circ$Member"\foo{
INSERT$\circ $MEMBER(x,y,z) is defined to be:
if x$\epsilon $y, then insert `T' into z, else insert
`NIL' into z. } is less so. The Worth facets of \4those\0 concepts
will say something about their overall value. And yet there is some
knowledge, some ``features" which would make any composition which
possessed them more interesting than a composition which lacked them:
\yskip
\noindent $\bullet $ Are the domain and range of the composition equal to each other?
\noindent $\bullet $ Are interesting properties of each component of the composition
preserved?
\noindent $\bullet $ Are undesirable properties lost (i.e., \4not\0 true about the
composition)?
\noindent $\bullet $ Is the new composition equivalent to some already-known operation?
\yskip
These hints about ``features to look for" belong tacked onto the
Composition concept, since they modify all compositions. Where and
how can this be done?
For this purpose each concept --- including ``Composition" --- can have
entries on its ``\4Interest\1'' facet. It contains a bunch of features
which (if true) would make any particular example of the current
concept interesting.
The format for the Interest facet is as follows:
\yskip \eightpoint \bf
\han4{< Conflict-matrix }
\han5{<Feature$↓1$, Value$↓1$, Reason$↓1$, Used$↓1$>}
\han5{<Feature$↓2$, Value$↓2$, Reason$↓2$, Used$↓2$>}
\han6{ $\vdots $ }
\han5{<Feature$↓k$, Value$↓k$, Reason$↓k$, Used$↓k$>}
\han4{ >}
\yskip
\tenpoint \1
\noindent This is the format of the facet itself, not of each entry. The
conflict-matrix is special and will be discussed below. Each
Feature/Value/Reason/Used quadruple will be termed an ``entry" on the
Interest facet.
Each ``Feature$↓i$'' is a LISP predicate, indicating whether or not
some interesting property is satisfied. The corresponding
``Value$↓i$'' is a numeric function for computing just how valuable
this feature is. The ``Reason$↓i$'' is a token (usually an English
phrase) which is tacked along and moved around, and can be inspected
by the user. The ``Used$↓i$" subpart is a list of all the concepts
whose definitions are known to incorporate\foo{ Not {\it satisfy} the
feature. Thus the general concept Domain=Range-op {\it incorporates}
the feature ``range(x) is one component of domain(x)" as just one of
the conjuncts in its definition. On the other hand, Set-union
{\it satisfies} the feature, since its range, Sets, really is one
component of its domain. } this feature; all examples of such
concepts will then automatically satisfy this Feature$↓i$.
For example, here is one entry from the Interest facet of Compose:
\yskip
\han3{{\6FEATURE: \it Domain(Arg1)=Range(Arg2)}}
\han3{{\6VALUE: \it $ .4 + .4\cdot Worth(Domain(Arg1)) +
.2\cdot Priority(current task)$ }}
\han3{{\6REASON: \it ``The composition of Arg1 and Arg2 will map from C back
into C"}}
\han3{{\6USED: \it Compose-with-self-Domain=Range-operation,
Interesting-compose-4}}
\yskip
Just as with Isa's and Generalizations, we can make a general
statement about Interest features:
\yskip
\han4{\6Any feature tacked onto the Interest facet of any member of
ISA's(C), also applies to C.}
\yskip
That is, X.Interest is relevant to C iff C is an example of X. For
example, any feature which makes an operation interesting, also makes
a composition interesting.
So we'd like to define the function Interests(C) as the union of the
Interest features found tacked onto any member of ISA's(C).\foo{ Recall
that the formula for this is ISA's(C) =
Generalizations(Isa(Generalizations(C))). } But some of these might
have already been conjoined to a definition, to form the concept C
(or a generalization of C). So all C's will trivially (by
definition) satisfy such features. The USED subparts can be employed
to find such features. In fact, the final value of Interests(C) is
Interest(ISA's(C)) {\it minus} all the
features whose USED subparts pointed to any member of ISA's(C).
This covers the purpose of each subpart of each entry on a typical
Interest facet. Now we're ready to motivate the presence of the
Conflict-matrices.
Often, AM will specialize a concept by conjoining onto its definition
some features which would make any example of the concept
interesting. So \4any\0 example of this new specialized concept is
thus guaranteed to be an \4interesting\0 example of the old concept.
Sometimes, however, a pair of features are exclusive: both of them
can never be satisfied simultaneously. For example, a composition can
be interesting if its domain and range coincide, but it can
also be interesting if its range is much more interesting than its domain.
Clearly, these two Interestingness features can't
both be true (``x=y" and ``x much more interesting than y" can't occur
simultaneously). If AM didn't have some systematic way to realize
this, however, it might create a new concept, called
Interesting-composition, defined as any composition satisfying both
of those features. But then this concept will be vacuous: \4no\0
operation can possibly satisfy that over-constrained definition; this
new concept will have no examples; it is the null concept; it is
trivially forgettable. Merely to think of it is a blot on AM's claim
to rationality. This is where the Conflict-matrix fits in.
The ``Conflict-matrix" is specified to prevent many such trivial
combinations from eating up a lot of AM's time.
If there are K features present
for the Interest facet of the concept, then its conflict-matrix will
be a K$\times $K matrix. In row i, column j of this matrix is a letter,
indicating the relationship between features i and j:
\noindent \6E \it \ Exclusive of each other: \rm they
both can't be true at the same time.
\noindent $\→$ \it Implies: \rm If feature i holds, then feature j must hold.
\noindent $←$ \it Implied by: \rm If feature j holds, then so does feature i.
\noindent \6= \it Equal: \rm Feature i holds precisely when feature j holds.
\noindent \6U \it \ Unrelated: \rm As far as known,
there is no connection between them.
\yskip
These little relations are utilized by some of the heuristic rules.
Here is one such rule. Its purpose is to create a new, specialized
form of concept C, if many examples of C were previously found very
quickly.
\han2{{\it IF Current-task is (Fillin Specializations of C)}}
\han3{{\it and $\|$C.Examples$\|$ > 30}}
\han3{{\it and Time-spent-on-C-so-far < 3 cpu seconds,}}
\han3{{\it and Interests(C) is not null,}}
\han2{{\it THEN create a new concept named Interesting-C,}}
\han3{{\it Defined as the conjunction of C.Defn and the highest-valued member of
Interests(C) which is {\6U}\ (unrelated) to any feature USED in the
definition of C.}}
\han3{{\it and add the following task to the agenda: Fillin examples of
Interesting-C, with value computed as the Value subpart of the chosen
feature, for this reason: ``Any example of Interesting-C is
automatically an interesting example of C".}}
\han3{{\it and add ``Interesting-C" to the USED subpart of the entry
where that
feature was originally plucked from.}}
\yskip
\SSSEC{Suggest}
This section describes a space-saving ``trick", and a ``fix-up" to undo
some potentially serious side-effects of that trick.
AM maintains two numeric threshholds:
Do-threshhold and a lower one, Be-threshhold.
When a new task is proposed, if its global priority is below
Be-threshhold, then it won't even be entered on the agenda. This
value is set so low that any task having even one mediocre reason
will make it onto the agenda.
After a task is finished executing, the top-rated one from the agenda
is selected to work on next. If its priority rating is below
Do-threshhold, however, it is put back on the agenda, and AM
complains that no task on the agenda is very interesting at the
moment. AM then spends a minute or so looking around for new tasks,
re-evaluating the priorities of the tasks on the agenda already, etc.
One way to find new tasks (and new reasons for already-existing
tasks) is to evaluate the ``\4Suggest\1'' facets of all the concepts in
the system. More precisely, each Suggest facet contains some
heuristics, encoded into LISP functions. Each function accepts a
number N as an argument (representing the minimum tolerable priority value
for a
new task), and the function returns as its output a list of new tasks.
These are then merged into the agenda, if desired.
Semantically, each function is one heuristic rule for suggesting a
new task which might be very plausible, promising, and \4a propos\0 at
the current time. For example, here is one entry from the Suggest
facet of Any-concept:
\yskip
\han3{{\it IF there are no examples for concept C filled in so far,}}
\han3{{\it THEN consider the task ``Fillin examples of C", for the following
reason: ``No examples of C filled in so far", whose value is half of
Worth(C). If that value is below arg1, then forget it; otherwise,
try to add to to the agenda.}}
\yskip
\noindent The argument ``arg1" is that low numeric value, N, supplied to the
Suggest facet.
This entry alone will produce a multitude of potential tasks; for
concepts whose Worth numbers are high, or for which a task is already
on the agenda to fill in their examples, these suggested tasks will
be remembered; most of the other ones will typically be forgotten.
One use of this facet is thus to ``beef up" the agenda whenever AM is
discontented with all the tasks thereon. At such a time, AM may call
on all the Suggest facets in the system, and a large volume of new
tasks will be added to the agenda. Many of them will exist there
already, but for different reasons, so many old tasks' priority
values will rise. After this period of suggesting is over, the
agenda's highest-ranking task will hopefully have a higher value than
any did before. Also at this time, the Be-threshhold and
Do-threshhold numbers are reduced. So there are two reasons why the
top task may now be rated higher than Do-threshhold. If it isn't,
then the threshholds are lowered again, and again all the Sugg facets
are triggered (this time with a lower N value).
Both threshholds are raised slightly every time AM succeeds in
picking and executing a task. So they follow a pattern of slow
increase, followed by a sudden decrement, followed by another slow
increase, etc. This was intended to mimic a human's increasing
expectations as he makes progress.
It also is suggestive of
the way a human strains his mind when an obstacle to that progress
appears; if the straining doesn't produce a brilliant new insight, he
grudgingly is willing to reduce his expectations, and perhaps resume
some ``old path" abandoned earlier.
\SSSEC{Fill/Check}
\qq{To doubt everything doesn't suffice; one must know {\sl why} he
doubts.}{Poincar\'e}
There is one more level of structure to AM's representation of a concept than
the simple ``properties on a property-list" image.
Each concept consists of a bunch of facets; each
facet follows the format layed down for it (and described in the
preceding several subsections).
Yet each facet of each concept can have two additional ``subfacets"
(little slots that are hung onto any desired slot) named \4Fill\-in\0 and
\4Check\1.
The ``Fill\-in" field of facet F of concept C is abbreviated
C.F.Fill\-in. The format of that subfield is a list of
heuristic rules,
encoded into LISP functions.
Semantically, each rule in C.F.Fill\-in should be relevant to
fill\-ing in entries for facet F of any concept which is a C.
This substructure is an implementation answer to the question of where to place
certain heuristic rules.
As an illustration, let me describe a typical rule which is found on
{\bf Com\-pose.Ex\-am\-ples.Fill\-in}.
According to the last paragraph, this must be useful for fill\-ing in
ex\-am\-ples of any operation which is a composition.
The rule says that if the composition A$\circ$B is
formed from two very time-consuming operations A and B, then it's worth
trying to find some ex\-am\-ples of A$\circ$B by symbolic means; in this case,
scan the ex\-am\-ples of A and of B, for some pair of ex\-am\-ples
x$→$y (ex\-am\-ple of B) and y$→$z (ex\-am\-ple of A). Then posit that
x$→$z is an ex\-am\-ple of A$\circ$B.
As another illustration, let me describe a typical rule which is found on
{\bf Compose.Conjec.Fill\-in}. It
says that one potential conjecture about a given
composition A$\circ$B is that it is unchanged from A (or from B). This happens
often enough that it's worth examining each time a new composition is made.
This rule applies precisely to the task of fill\-ing in conjectures about particular
compositions.
Let's take yet a third illustration.
The subfacet {\bf Any-Concept.Ex\-am\-ples.Fill\-in}\0
is quite large; it contains all the known
methods for fill\-ing in ex\-am\-ples of C (when all we know is that C is a concept).
Here are a few of those techniques\foo{ The interested reader will find them all listed
in Appendix 2.2.37.}:
\yskip
\noindent $\bullet $ Instantiate C.Defn
\noindent $\bullet $ Search the
ex\-am\-ples facets of all the concepts on Generalizations(C) for ex\-am\-ples of C
\noindent $\bullet $ Run some of the concepts named in
In-ran-of(C)\foo{ i.e., operations whose range
is C} and collect the resultant values.
\yskip
{\bf Any-Concept.Ex\-am\-ples.Check} is large for similar reasons.
A typical entry there says to examine each verified ex\-am\-ple of C: if
it is also an ex\-am\-ple of a specialization of C, then it must be removed
from C.Ex\-am\-ples and inserted\foo{ Conditionally. Since each concept is of finite
worth, it is allotted a finite amount of space. A random number is generated to
decide whether or not to actually insert this ex\-am\-ple into the Ex\-am\-ples facet of
the specialization of C. The more that specialized concept is ``exceeding its
quota",
the narrower the range that the random number must fall into to have that
new item inserted. The probability is never precisely 1 or 0. }
into the Ex\-am\-ples facet of that specialized concept.
Here is one typical entry from {\bf Operation.Domain/Range.Check}:
\yskip
\han3{{\it IF a domain/range entry has the form (D D D$\ldots \ \→\ $ R),}}
\han4{{\it and all the D's are equal, and R is a generalization of D,}}
\han3{{\it THEN it's worth seeing whether (D D D$\ldots \ \→\ $ D)
is consistent with all known ex\-am\-ples of the operation.}}
\han3{{\it If there are no known ex\-am\-ples,
add a task to the agenda requesting they be filled in.}}
\han3{{\it If there are ex\-am\-ples, and (D D D$\ldots \ \→\ $ D)
is consistent, add it to the Domain/range
facet of this operation.}}
\han3{{\it If there are some contradicting ex\-am\-ples, create a new concept which is defined as
this operation restricted to (D D D$\ldots \ \→\ $ D).}}
\yskip
\noindent Note that this ``Checking" rule doesn't just passively check the designated facet; it
actively ``fixes up" faulty entries, adds new tasks, creates new concepts, etc.
All the check rules are very aggressive in this way.
For ex\-am\-ple, one entry on {\bf No-multiple-elements-structure.Exam\-ples.Check}
will actually remove any multiple occurrences of an element from a structure.
The set Checks(C.F) of all relevant rules for
checking facet F of concept C is obtained as
(ISA's(C)).F.Check. That is, look for the Check subfacet of the
F facet of all the concepts on ISA's(C)).
Similarly, Fill\-ins(C.F) is the union of the Fill\-in subfacets of the F facets of
all the concepts on ISA's(C).
When AM chooses a task like ``\6Fill\-in ex\-am\-ples of Primes\1",
its first action is to
compute Fill\-ins(Primes.Exs).
It does this by asking for ISA's(Primes); that is, a list of all concepts of which
Primes is an ex\-am\-ple. This list is: <Objects Any-concept Anything>.
So the relevant heuristics are gathered from Objects.Exs.Fill\-in, etc.
This list of heuristics is then executed, in order
(last executed are the heuristics attached to Anything.Exs.Fill\-in).
\SSSEC{Other Facets which were Considered}
Most facets (like ``Definitions") were anticipated from the very beginning
planning of AM, and proved just as useful as expected.
Others (like ``Intuitions")
were also expected to be very important, yet were a serious disappointment.
Still others (like ``Suggest") were unplanned and grumblingly acknowledged as
necessary for the particular LISP program that bears the name AM.
Finally, we turn to a few facets which were initially planned, and yet which
were adjudged useless around the time that AM was coded. They were therefore
never really a part of the LISP program AM, although they figured in its
proposal. Let me list them, and explain why each one was dropped.
\yskip
$\bullet $ \2 UN-INTERESTINGNESS.\1 This was to be similar to the
Interest facet. It would
contain entries of the form feature/value/reason, where the feature would be
a \4bad\0 (dull, trivializing, undesirable, uninteresting) property that an
entity (a concept or a task) might possess.
If it did, then the value component would return a
negative number as its contribution to the worth/priority of that entity.
This sounded plausible, but turned out to be useless in practice:
(i) There were very few features one could point to which explicitly indicated
when something was boring; (ii) Often, a conjunction of many such features
would make the entity seem unusual, hence interesting; (iii) Most entities
were viewed as very mediocre unless/until specific reasons to the contrary,
and in those cases the presence a few boring properties would be outshadowed
by the few non-boring ones. In a sea of mediocrity, there is little need to
separate the boring from the very boring.
$\bullet $ \2 JUSTIFICATION.\1
For conjectures
which were not yet believed with certainty, this part would contain all the known
evidence supporting hem.
This would hopefully be convincing, if
the user (or a concept) ever wanted to
know.
In cases of contradictions arising
somehow, this facet was to keep hold of the threads that could be untangled
to resolve those paradoxes. As described earlier, this duty could naturally be
assumed by the Conjecs facet of each concept. The other intended role for this
facet was to hold sketches of the proofs of theorems.
Unfortunately, the intended concepts for Proof and Absolute truth were
never implemented, and thus most of the heuristic rules which would have interacted
with this facet are absent from AM. It simply was never needed.
$\bullet $ \2 RECOGNITION.\1
Originally, it was assumed that the location of relevant concepts and
their heuristics would be much more like a
free-for-all (pandemon\-i\-um) than an orderly rippling process.
As with the original use of BEINGs\foo{ Interacting knowledge modules, each module
simulating a different expert at a round-table meeting. See [Lenat 75b]. }, the
expectation was that each concept would have to ``shout out" its relevance whenever
the activities triggered some recognition predicate inside that concept.
Such predicates were to be stored in this facet.
But it quickly became apparent that the triggering predicates which were the
left-hand-sides of the heuristic rules were quick enough to obviate the need
for pre-processing them too heavily. Also, the only rules relevant to a given
activity on concept C always seemed to be attainable by rippling in a certain
direction away from C. This varied with the activity, and a relatively small table
could be written, to specify which direction to ripple in (for any given desired
activity). We see that for ``Fill-in examples of$\ldots $", the direction to ripple in
is ``Generalizations", to locate relevant heuristic rules.
For ``Judge interest of$\ldots $" the direction is also generalizations. For
``Access specializations of", the direction is Specializations, etc.
The only important point here is that the Recognition facet was no longer needed.
\SSEC{AM's Starting Concepts}
The first subsection presents a diagram of the top-level (general)
concepts AM started with, with the lines indicating the
Generalizations/Specializations kinds of relationships (single line
links) and a few Examples/Isa's links (triple vertical lines).
Several specific concepts have been omitted from that picture. All
the concepts initially fed to AM are then listed alphabetically and
described.
Finally, in Section 2.5.2.3, we discuss
the choice of starting concepts.
\SSSEC{Diagram of Initial Concepts}
\vskip 6in % Put the tree of concepts diagram here
The diagram above represents the ``topmost" concepts
which AM had initially,
shown connected via
Specialization links (single lines) and Examples links (triple lines).
The only concepts not diagrammed are the 47
\4examples\0 of the concept Operation.
Also, we should note that many entities exist in the system
which are not themselves concepts. For example, the number ``3", though it
be an \4example\0 of many concepts, is not itself a concept.
All entities which \4are\0 concepts are present on the list called
CONCEPTS, and they all have property lists (with facet names as the
properties). In hindsight, this somewhat arbitrary scheme is regrettable.
A more aesthetic designer might have come up with a more uniform system
of representation than AM's.
\SSSEC{Summary of Initial Concepts}
Since the precise set of concepts is not central to the design of AM,
or the quality of behaviors of AM, they are not worth detailing here.
On the other hand, a cursory familiarity with their names and
definitions should aid the reader in building up an understanding of
what AM has done. For that reason, the concepts will now be briefly
described, in alphabetical order.
\6ACTIVITY\0 represents something that can be ``performed". All
Actives --- and \4only\0 Actives --- have Domain/range facets and
Algorithms facets.
\6ALL-BUT-FIRST-ELEMENT\0 is an operation which takes an ordered
structure and removes the first element from it. It is similar in
spirit to the Lisp function ``CDR".
\6ALL-BUT-LAST-ELEMENT\0 takes an ordered structure and removes its
last element.
\6ANY-CONCEPT\0 is useful because it holds all the very general
tactics for filling in and checking each facet. The definition of
Any-concept is \6``$\lambda $ (x) x$\epsilon $CONCEPTS"\1.
`\6CONCEPTS\1' is AM's global
list of entities known to be concepts. Initially, this list contains
the hundred or so concepts which AM starts with (e.g., all those
diagrammed on the preceding page).
\6ANYTHING\0 is defined as \6``$\lambda $ (x) T"\1; i.e., a predicate which
will \4always\0 return true. Notice that the singleton $\{a\}$ is an
example of Anything, but (since it's not on the list \6CONCEPTS\0) it
is not an example of Any-concept.
\6ATOM\0 contains data about all primitive, indivisible objects
(identifiers, constants, variables).
\6BAG\0 is a type of structure. It is unordered, and multiple
occurrences of the same element are permitted. They are isomorphic to
the concept known as `multiset', except that we stipulate that sets
are \4not\0 bags.
\6BAG-DELETE\0 is an operation which takes two arguments, x and B.
Although x can be anything, B must be a bag. The procedure is to
remove one occurrence of x from B.
\6BAG-DIFF\0 is an operation which takes two bags B,C. It repeatedly
picks a member of C, and removes it (one occurrence of it) from both
B and C. This continues until C is empty.
\6BAG-INSERT\0 is an operation which adds (another occurrence of) x
into bag B.
\6BAG-INTERSECT\0 takes two bags B,C, and creates a new bag D. An
item occurs in D the \4minimum\0 number of times it occurs in either
B or C.
\6BAG-UNION\0 takes bag C and dumps all its elements into bag B.
\6CANONIZE\0 is both an example of and a specialization of
`Operation'. It accepts two predicates P1 and P2 as arguments, both
defined over some domain A$\times $A, where P1 is a generalization of P2.
Canonize then tries to produce a ``standard representation" for
elements of A, in the following way. It creates an operation f from A
into A, satisfying: \it
P1(x,y) iff P2(f(x),f(y))\1. Then any item of
the form f(x) is called a canonical member of A. The set of such
canonical-A's is worth naming, and it is worth investigating the
restrictions of various operations' domains and ranges to this set of
canonical-A's\foo{ i.e., take an operation which used to have ``A" as one
of its domain components or as its range, and try to create a new
operation with essentially the same definition but whose domain/range
says ``Canonical-A" instead of ``A". }. ``Canonize" contains lots of
information relevant to creating such functions f (given P1 and P2).
Thus Canonize is an example of the concept Operation. Canonize also
contains information relevant to dealing with any and all such f's.
So Canonize is a specialization of Operation.
\6COALESCE\0 admits the same duality\foo{ Both a specialization of
Operation and an example of Operation. }. This very useful operation
takes as its argument any operation F(a,b,c,d$\ldots $), locates two domain
components which intersect (preferably, which are equal; say the
second and third), and then creates a new operation G defined as
G(a,b,d$\ldots $) =$↓{df}$ F(a,b,b,d$\ldots $).
That is, F is called upon with a pair
of arguments equal to each other. If F were Times, then G would be
Squaring. If F were Set-insert, then G would be the operation of
inserting a set S into itself.
\6COMPOSITION\0 involves taking two operations A and B, and applying
them in sequence: A$\circ$B(x) =$↓{df}$ A(B(x)). This concept deals with (i)
the activity of creating new compositions, given a pair of
operations; (ii) all the operations which were created in this
fashion. That is why this concept is both a specialization of and an
example of Operation.
\6CONJECTURES\0 are a kind of object. This concept knows about --- and
can store --- conjectures. When proof techniques are inserted into AM,
this tiny twig of the tree of concepts will grow to giant
proportions.
\6CONSTANT-PREDICATE\0 is a predicate which can afford to have a very
liberal domain: it always ignores its arguments and just returns the
same logical value all the time.
\6DELETE\0 is an operation which contains all the information common
to all flavors of removing an element from a structure (regardless of
the type of structure which is being attenuated). When called upon
to actually perform a deletion, this concept determines the type of
structure and then calls the appropriate specialized delete concept
(e.g., Bag-delete).
\6DIFFERENCE\0 is another general operation, which accepts two
structures, determines their type (e.g., Bags), and then calls the
appropriate specialized version of difference (e.g., Bag-diff).
\6EMPTY-STRUCTURE\0 contains data relevant to structures with no
members.
\6FIRST-ELEMENT\0 is an operation which takes an ordered structure
and returns the first element. It is like the Lisp function `CAR'.
\6IDENTITY\0 is just what it claims to be. It takes one argument and
returns it immediately. The main purpose of knowing about this boring
transformation is just in case some new concept turns out
unexpectedly to be equivalent to it.
\6INSERT\0 takes an item x and a structure S, determines S's type,
and calls the appropriate flavor of specialized Insertion concept.
The general INSERT concept contains any information common to all of
those insertion concepts.
\6INTERSECT\0 is an operation which computes the intersection of any
two structures. It, too, has a separate specialization for Bags,
Sets, Osets, and Lists.
\6INVERT-AN-OPERATION\0 is a very active concept. It can invert any
given operation. If F:X$→$Y is an operation, then its inverse will be
abbreviated F\inv , and F\inv (y) is defined as all the x's in X for
which F(x)=y. The domain and range of F\inv \ are thus the range and
domain of F.
\6INVERTED-OP\0 contains information specific to operations which
were created as the inverses of more primitive ones.
\6LAST-ELEMENT\0 takes an ordered structure and returns its final
member.
\6LIST\0 is a type of structure. It is ordered, and multiple
occurrences of the same element are permitted. Lists are also called
vectors, tuples, and obags (for ``ordered bags").
\6LIST-DELETE\0 is an operation which takes two arguments, x and B.
Although x can be anything, B must be a list. The procedure is to
remove the first occurrence of x from B.
\6LIST-DIFF\0 is an operation which takes two lists B,C. It
repeatedly picks a member of C, and removes it (the first remaining
occurrence of it) from both B and C. This continues until there are
no more members in C.
\6LIST-INSERT\0 is an operation which adds (another occurrence of) x
onto the front of list B. It is like the Lisp function `CONS'.
\6LIST-INTERSECT\0 takes two lists B,C, and creates a new list D. An
item occurs in D the \4minimum\0 number of times it occurs in either
B or C. D is arranged in order as (a sublist of) list B.
\6LIST-UNION\0 takes list C glues it onto the end of list B. It's
like `APPEND' in Lisp.
\6LOGICAL-RELATION\0 contains knowledge about Boolean combinations:
disjunction, conjunction, implication, etc.
\6MULTIPLE-ELEMENTS-STRUCTURES\0 are a specialized kind of Struc\-ture.
They permit the same atom to occur more than once as a member. (e.g.,
Bags and Lists)
\6NO-MULTIPLE-ELEMENTS-STRUCTURES\0 are a specialization of
Structure. They permit the same atom to occur only once as a member.
(e.g., Sets and Osets)
\6NONEMPTY-STRUCTURES\0 are a specialization of Structure also. They
contain data about all structures which have some members.
\6OBJECT\0 is a general, static concept. Objects are like the
subjects and direct objects in sentences, while the Actives are like
the verbs\foo{ As in English, a particular Activity can sometimes itself
be the subject. }.
\6OBJECT-EQUALITY\0 is a predicate. It takes a pair of objects, and
returns True if (i) they are identical, or (ii) they are structures,
and each corresponding pair of members satisfies Object-Equality.
Often we'll call this `Equal', and denote it as `='.
\6OPERATIONS\0 are Actives which take arguments and return a value.
While a predicate examines its arguments and returns either True or
False, an operation examines \4its\0 arguments and returns any number
of values, of varying types. Assuming that the arguments lay in the
domain of the operation (as specified by some entry on its
Domain/range facet), then every value returned must lie within its
range (as specified by that same Domain/range entry).
\6ORDERED-PAIR\0 is a kind of List. It has just two `slots', however:
a front and a rear element.
\6ORDERED-STRUCTURE\0 is a specialized type of Structure. It includes
all structures for which the order of insertion of two members can
make a difference in whether the structures are equal or not.
Ordered-structures are those for which it makes sense to talk about a
front and a rear, a first element and a last element.
\6OSET\0 is a type of structure. It is ordered, and multiple
occurrences of the same element are not permitted. The
short-term-memory of Newell's PSG [Newell 73] is an Oset, as is a cafeteria
line. Not much use was found for this concept by AM.
\6OSET-DELETE\0 removes x from oset B (if x was in B).
\6OSET-DIFF\0 is an operation which takes two osets B,C. It removes
each member of C from B.
\6OSET-INSERT\0 is an operation which adds x to the front of oset B.
If x was in B previously, it is simply moved to the front of B.
\6OSET-INTERSECT\0 takes two osets B,C, and removes from B any items
which are \4not\0 in C as well. B thus `induces' the ordering on the
resultant oset.
\6OSET-UNION\0 takes oset C, removes any elements in B already, then
glues what's left of C onto the rear of B.
\6PARALLEL-JOIN\0 is an operation which takes a kind of structure and
an operation H. It creates a new operation F, whose domain is that
type of structure. For any such structure S, F(S) is computed by
appending together H(x) for each member x of S.
\6PARALLEL-JOIN2\0 is a similar operation. It creates an operation F
with two structural arguments. F(S,L) is computed by appending the
values of H(x,L), as x runs through the elements of S.\foo{Here, the
args to PARALLEL-JOIN2 are two types of structures SS and LL, and an
operation H whose range is also a structural type DD. Then a new
operation is created, with domain SS$\times $LL and range DD. }
\6PARALLEL-REPLACE\0 is an operation used to synthesize new
substitution operations. It takes a structural type and an operation
H as its arguments, and creates a new operation F. F(S) is computed
by simply replacing each member x of S by the value of F(x). The
operation produced is very much like the Lisp function MAPCAR.
\6PARALLEL-REPLACE2\0 is a slightly more general operation. It
creates F, where F(S,L) is computed by replacing each x$\epsilon $S by
F(x,L).
\6PREDICATES\0 are actives which examine their arguments and then
return T or NIL (True or False). It is only due to the
capriciousness of AM's initial design that predicates are kept
distinct from operations. Of course, each example of an operation
can be viewed as if it were a predicate; if F:A$→$B is any operation
from A to B, then we can consider F a relation on A$\times $B, that is a
subset of A$\times $B, and from there pass to viewing F as a (characteristic)
predicate F:A$\times $B$→\ \{$T,F$\}$. Similarly, any predicate on
$A\times \ldots \times B \times C$ may
be considered an operation (a multi-valued, not-always-defined
function) from $A\times \ldots \times B$ into $C$.
There are no unary predicates. If
there were one, say P:A$→\ \{T,F\}$, then that predicate would essentially
be a new way to view a certain subset of A; the predicate would then
be transformed into $\{a \epsilon A\ |\ P(a)\}$,
made into a new concept, tagged
as a specialization of A, and its definition would be
``$\lambda \, (a) \biglp A.Defn(a) \meet P(a)\bigrp $".
\6PROJECTION1\0 is a simple operation. It is defined as ``$\lambda $ (x y)
x\1." Notice that Identity is just a specialized restriction of
Proj1. Proj1(Me,You)=Me.
\6PROJECTION2\0 is a similar operation. Proj2(Me,You)=You.
\6RELATION\0 is any Active which has been encapsulated into a set of
ordered pairs. `Relation' bridges the gap between active and static
concepts.
\6REPEAT\0 is an operation for generating new operations by repeating
old ones. Given as its argument a structural type $\Sscr $ and an existing
operation H (with domain and range of the form
$\Sscr \times \Sscr \, \→\,\Sscr$),
Repeat($\Sscr $,H) synthesizes a brand new operation F. The domain/range of
F is just that of H. F(S) is computed by repeating TEMP$←$H(x,TEMP) for
each element x of S. TEMP is initialized as some member (preferably
the first element) of S.
\6REPEAT2\0 is similar, but requires that H take three arguments, and
it creates a new operation F,
where F(S,L) is computable by repeatedly doing
TEMP$←$H(x,TEMP,L).
\6RESTRICT\0 is an operation which turns out new
operations. Given as an argument some operation (or predicate) F,
RESTRICT would synthesize a new operation which
would have the same definition as F, but would have its domain and/or
range curtailed.
\6REVERSE-ORDERED-PAIR\0 transforms the ordered pair $<x,y>$ into
$<y,x>$.
\6SET\0 is a type of structure. It is unordered, and multiple
occurrences of the same element are not permitted.
\6SET-DELETE\0 is an operation which takes two arguments, x and B.
Although x can be anything, B must be a set. The procedure is to
remove x from B (if x was in B), then return the resultant value of
B.
\6SET-DIFF\0 is an operation which takes two sets B,C. It removes
each member of C from B.
\6SET-INSERT\0 is an operation which adds x to set B.
\6SET-INTERSECT\0 removes from set B any items which are \4not\0 in
set C, too.
\6SET-UNION\0 dumps into B all the members of C which weren't in
there already.
\6STRUCTURE\1, the antithesis of ATOM, is inherently divisible. A
structure is something that has members, that can be broken into
pieces. There are two questions one can ask about any kind of
structure: Is it ordered or not? Can there be multiple occurrences of
the same element in it or not? There are four sets of answers to
these two questions, and each of the four specifies a well-known kind
of structure (Sets, Lists, Osets, Bags).
\6STRUCTURE-OF-STRUCTURES\0 is a specialization of Structure, representing
those structures all of whose \4members\0 are themselves structures.
\6TRUTH-VALUE\0 is a specialized kind of atomic object. Its only
examples are True and False. This concept is the range set for all
predicates.
\6UNION\0 is a general kind of joining operation. It takes two
structures and combines them. Four separate variants of this concept
are given to AM initially (e.g., Set-union).
\6UNORDERED-STRUCTURE\0 is a specialized type of Structure. It
includes all structures for which the order of insertion of two
members never makes any difference in whether the structures are
equal or not. Unordered-structures cannot be said to have a front or a
rear, a first element or a last element.
\SSSEC{Rationale behind Choice of Concepts}
A necessary part of realizing AM was to choose a
particular set of starting concepts. But how should such a choice be
made?
My first impulse was to gather a \4complete\0 set of concepts. That
is, a basis which would be sufficient to derive all mathematics. The
longer I studied this, the larger the estimated size of this basis
grew. It immediately became clear that
this would never fit in 256k. \foo{ This is the size of the core memory
of the computer I had at my disposal. } One philosophical problem
here is that future mathematics may be inspired by some real-world
phenomena which haven't even been observed yet. Aliens visiting Earth
might have a different mathematics from ours, since their collective
life experiences could be quite different from we Terrans'.
Scrapping the idea of a sufficient basis, what about a necessary one?
That is, a basis which would be \4minimal\0 in the following sense:
if you ever removed a concept from that basis, it could never be
re-discovered. In isolated cases, one can tell when a basis is
\4not\0 minimal: if it contains both addition and multiplication,
then it is too rich, since the latter can be derived from the
former.\foo{ by AM, and by any mathematician. As Don Cohen points out,
if the researcher lacked the proper discovery methods, then he might
never derive Times from Plus. } And yet, the same problem about
``absoluteness" cropped up: how can anyone claim that the discovery of
X can \4never\0 be made from a given starting point? Until recently,
mathematicians didn't realize how natural it was to derive numbers
and arithmetic from set theory (a task which AM does, by the way)\foo{
The ``new math" is trying to get young children to do this as well;
unfortunately, no one showed the elementary-school teachers the
underlying harmony, and the results have been saddening. }. So 50
years ago the concepts of set theory and number theory would both
have been undisputedly placed into a ``minimal" basis. There are thus
no absolute conceptual primitives; each culture (perhaps even each
individual) possesses its own basis.
Since I couldn't give AM a minimal basis, nor a complete one, I
decided AM might as well have a \4nice\0 one. Although it can never
be minimal, it should nevertheless be made very small (order of
magnitude: 100 concepts). Although it can never be complete, it
should suffice for re-discovering much of already-known mathematics.
Finally, it should be \4rational\1, by which I mean that there should
be a simple rule for deciding which concepts do and don't belong in
that basis.
The concepts AM starts with are meant to be those possessed by young
children (age 4, say). This explains some omissions of concepts which
would otherwise be considered fundamental: (i) Proof and techniques
for proof/disproof; (ii) Abstract properties of relations, like
associativity, single-valued, onto; (iii) Cardinality, arithmetic;
(iv) Infinity, continuity, limits. The interested reader should see
[Piaget 55] or [Copeland 70].
Because my programming time and the PDP-10's memory space were both
quite small, only a small percentage of these `pre-numerical'
concepts could be included. Some unjustified omissions are: (i)
visual operations, like rotation, coloration; (ii) Games, rules,
procedures, strategies, tactics; (iii) Geometric notions, e.g.,
outside and between.
AM is not supposed to be a model of a child, however. It was never
my intention (and it would be much too hard for me) to try to emulate
a human child's whimsical imagination and emotive drives. And AM is
not ripe for ``teaching", as are children.\foo{ Learning psychologists
might label AM as neo-behavioristic and cognitivistic. See
[Le\-Fran\-cois 72]. }
Also, though it possesses a child's ignorance of most concepts, AM is given
a large body of sophisticated ``adult" heuristics. So perhaps
a more faithful image is that of Ramanujan, a
brilliant modern mathematician who received a very poor education,
and was forced to re-derive much of known number theory all by
himself. Incidentally, Ramanujan never did master the concept of formal proof.
There is no formal justification for the particular set of starting
concepts. They are all reasonably primitive (sets, composition), and
lie several levels ``below" the ones which AM managed to ultimately
derive (prime factorization, square-root). It might be valuable to
attempt a similar automated math discoverer, which began with a very
different set of concepts (e.g., start it out as an expert in lattice
theory, possessing all known concepts thereof). The converse kind of
experiments are to vary the initial base of concepts, and observe the
effects on AM's behavior. A few experiments of that form are
described in Section 6.2.